1.1 GENERAL DESCRIPTION

The ADSP-21020 and ADSP-21010 are the two members of Analog 
Devices' ADSP-21000 family of floating-point digital signal processors 
(DSPs). The ADSP-21000 family architecture further addresses the five 
central requirements for DSPs established in the ADSP-2100 family of 16- 
bit fixed-point DSPs: 

• Fast, flexible arithmetic computation units 

• Unconstrained data flow to and from the computation units 

• Extended precision and dynamic range in the computation units 

• Dual address generators 

• Efficient program sequencing 

Fast, Flexible Arithmetic. The ADSP-21020/21010 executes all 
instructions in a single cycle. It provides both one of the fastest cycle times 
available and the most complete set of arithmetic operations, including 
Seed 1/X, Seed 1/√X, Min, Max, Clip, Shift and Rotate, in addition to the
traditional multiplication, addition, subtraction and combined addition/ 
subtraction. It is IEEE floating-point compatible and allows either 
interrupt on arithmetic exception or latched status exception handling. 

Unconstrained Data Flow. The ADSP-21020/21010 has a Harvard 
architecture combined with a 10-port data register file. In every cycle: 

• Two operands can be read or written off-chip to or from the register 
file, 

• Two operands can be supplied to the ALU, 

• Two operands can be supplied to the multiplier, and 

• Two results can be received from the ALU and multiplier (three, if the 
ALU operation is a combined addition/subtraction). 

The processors' 48-bit orthogonal instruction word supports fully parallel 
data transfer and arithmetic operations in the same instruction.

40-Bit Extended Precision. The ADSP-21020 and ADSP-21010 handle
32-bit IEEE floating-point format as well as 32-bit integer and fractional 
formats (twos-complement and unsigned), while the ADSP-21020 also 
handles extended-precision 40-bit IEEE floating-point format. The 
ADSP-21020 carries extended precision throughout its computation units, 
limiting intermediate data truncation errors. When working with data 
on-chip, the extended-precision 32-bit mantissa can be transferred to and 
from all computation units. The 40-bit data bus may be extended off-chip 
as desired. The fixed-point formats have an 80-bit accumulator for true 
32-bit fixed-point computations. 

Dual Address Generators. The ADSP-21020/21010 has two data address 
generators (DAGs) that provide immediate or indirect (pre- and post- 
modify) addressing. Modulus and bit-reverse operations are supported 
with no constraints on buffer placement. 

Efficient Program Sequencing. In addition to zero-overhead loops, the 
ADSP-21020/21010 supports single-cycle setup and exit for loops. Loops 
are both nestable (six levels in hardware) and interruptible. The processor
supports both delayed and non-delayed branches.

1.1.1 Key Enhancements

The ADSP-21000 family enhances the core DSP architecture to enable 
easier system development. The enhancements occur in four key areas: 

• Architectural features for high-level language and operating system 
support. 

• Access to serial scan path (IEEE 1149.1 compatible) and on-chip 
emulation features. 

• Support of IEEE floating-point formats. 

• Open memory system. 

High Level Languages. The ADSP-21000 family architecture has several 
features which directly support high-level language compilers and 
operating systems: 

• General purpose data and address register files, 

• 32-bit native data types, 

• Large address spaces (16M words in program memory, 4G words in 
data memory), 

• Pre- and post-modify addressing, 

• Unconstrained circular buffer placement, and 

• On-chip program, loop, and interrupt stacks. 



Additionally, the ADSP-21000 family architecture is designed specifically 
to support ANSI standard Numerical C — the first compiled language to 
support vector data types and operators for numeric and signal 
processing. 

Serial Scan and Emulation Features. The ADSP-21020/21010 supports
the IEEE P1149 Joint Test Action Group (JTAG) standard for
system test. This standard defines a method for serially scanning the I/O
status of each component in a system. This serial port is also used to gain
access to the ADSP-21020/21010 on-chip emulation features.

IEEE Formats. The ADSP-21020/21010 supports IEEE floating-point data
formats. This means that algorithms developed on IEEE-compatible 
processors and workstations are portable across processors without 
concern for possible instability introduced by biased rounding or 
inconsistent error handling. 

Open Memory System. No on-chip memory is included on the 
ADSP-21020/21010 (aside from a high-performance cache) specifically to 
avoid artificially constraining the development and upgrade of floating- 
point signal processing applications. This approach also facilitates the use 
of high-level languages and multitasking operating systems. 

1.1.2 Why Floating-Point?

A processor's data format determines its ability to handle signals of 
differing precision, dynamic range, and signal-to-noise ratios. However, 
ease-of-use and time-to-market considerations are often equally 
important. 

Precision. The precision of converters has been increasing and will 
continue to increase. In the past several years, average precision 
requirements have increased by 3 bits. A 20-bit audio A/D converter is 
now available from Analog Devices, and the trend is for both precision 
and sampling rates to increase. 

Dynamic Range. Compression and decompression algorithms have 
traditionally operated on signals of known bandwidth. These algorithms 
were developed to behave regularly, to keep costs down and 
implementations easy. Increasingly, however, the trend in algorithm 
development is not to constrain the regularity and dynamic range of 
intermediate results. Adaptive filtering and imaging are two applications 
requiring wide dynamic range.

Signal-to-Noise Ratio. Radar, sonar and even commercial applications
like speech recognition require wide dynamic range in order to discern 
selected signals from noisy environments. 

Ease-of-Use. In general, floating-point digital signal processors are easier 
to use and allow a quicker time-to-market than processors that do not 
support floating-point formats. The extent to which this is true depends 
on the floating-point processor's architecture. Consistency with IEEE 
workstation simulations and the elimination of scaling are two clear ease- 
of-use advantages. High-level language programmability, large address 
spaces, and wide dynamic range allow system development time to be 
spent on algorithms and signal processing problems rather than assembly 
coding, code paging, and error handling. 

1.1.3 Future Product Migration Path

Analog Devices offers the ADSP-21000 family architecture as the highest 
performance for signal processing applications. Future processors based 
on this architecture will offer higher speed and feature integration, 
incorporating both internal memory and I/O peripherals on-chip. 


1.2 ARCHITECTURE OVERVIEW

The following sections summarize the basic features of the ADSP-21020/ 
21010 architecture. These features are described in more detail in 
succeeding chapters. Figure 1.1 shows a block diagram of the ADSP-21020 
with its 40-bit data paths.

1.2.1 Computation Units 

The ADSP-21020/21010 contains three independent computation units: an 
ALU, a multiplier with fixed-point accumulator, and a shifter. For meeting 
a wide variety of processing needs, the computation units process data in 
three formats: 32-bit fixed-point, 32-bit floating-point and 40-bit floating- 
point (ADSP-21020 only). The floating-point operations are single- 
precision IEEE-compatible. The 32-bit floating-point format is the 
standard IEEE format, whereas the 40-bit IEEE extended-precision format 
has eight more LSBs of mantissa for additional accuracy. 

The ALU performs a standard set of arithmetic and logic operations in 
both fixed-point and floating-point formats. The multiplier performs 
floating-point and fixed-point multiplication as well as fixed-point 
multiply/add and multiply/subtract operations. The shifter performs 
logical and arithmetic shifts, bit manipulation, field deposit and extraction 
and exponent derivation operations on 32-bit operands.

Figure 1.1 ADSP-21020 Block Diagram


The computation units perform single-cycle operations; there is no 
computation pipeline. The units are connected in parallel rather than 
serially. The output of any unit may be the input of any unit on the next 
cycle. In a multifunction computation, the ALU and multiplier perform 
independent simultaneous operations. A 10-port register file is used for 
transferring data between the computation units and the data buses, and 
for storing intermediate results. The register file has two sets (primary and 
alternate) of sixteen registers each, for fast context switching. The registers 
are 32 bits wide on the ADSP-21010 and 40 bits wide on the ADSP-21020.

1.2.2 Address Generators And Program Sequencer

Two dedicated address generators and a program sequencer supply 
addresses for memory accesses. Together the sequencer and data address 
generators allow computational operations to execute with maximum 
efficiency because the computation units can be devoted exclusively to 
processing data. Because of its instruction cache, the ADSP-21020/21010 
can simultaneously fetch an instruction and access data in both program 
memory and data memory. 

The data address generators (DAGs) provide memory addresses when 
external memory data is transferred over the parallel memory ports to or 
from internal registers. Dual data address generators enable the processor 
to output simultaneous addresses for dual operand reads and writes. 
DAG1 supplies 32-bit addresses to data memory. DAG2 supplies 24-bit 
addresses to program memory for program memory data accesses. 

Each DAG keeps track of up to eight address pointers, eight modifiers and 
eight length values. A pointer used for indirect addressing can be 
modified by a value in a specified register, either before (pre-modify) or 
after (post-modify) the access. A length value may be associated with each 
pointer to implement automatic modulo addressing for circular buffers, 
which can be located on arbitrary boundaries. Each DAG register has an 
alternate register that can be activated for fast context switching. 
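
As a rough illustration of the post-modify and modulo addressing described above, the following Python sketch models a single pointer update. The names post_modify, index, modifier, base and length are assumptions made for this example. So are the use of a base address to place the buffer on an arbitrary boundary and the convention that a length of zero disables wraparound. It is a behavioral sketch only, not a description of the DAG hardware.

```python
def post_modify(index, modifier, base, length):
    """Return (address_used, next_index) for one post-modify access.

    index, modifier, base and length are hypothetical names for the pointer,
    the modify value, the buffer start and the buffer length.
    """
    address = index                  # post-modify: the unmodified pointer addresses memory
    next_index = index + modifier    # the pointer is updated after the access
    if length > 0:                   # assumption: length 0 means no circular buffer
        next_index = base + (next_index - base) % length
    return address, next_index


# Step by 2 through a five-word buffer that starts at an arbitrary address.
pointer = 0x1003
visited = []
for _ in range(6):
    used, pointer = post_modify(pointer, 2, base=0x1003, length=5)
    visited.append(used)
print([hex(a) for a in visited])
```

In this model the accesses visit 0x1003, 0x1005, 0x1007, 0x1004, 0x1006 and then 0x1003 again, which shows the pointer wrapping inside the buffer without any explicit compare or branch in the program.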

The program sequencer supplies instruction addresses to the program 
memory. It controls loop iterations and evaluates conditional instructions. 
With an internal loop counter and loop stack, the ADSP-21020/21010 
executes looped code with zero overhead. No explicit jump instructions 
are required to loop or to decrement and test the counter. 

The ADSP-21020/21010 achieves its fast program execution rate by means 
of pipelined fetch, decode and execute cycles. External memories have more 
time to complete an access than if there were no decode cycle; 
consequently, ADSP-21020/21010 systems can be built using slower and 
therefore less expensive memories. 

The program sequencer includes a 32-word instruction cache. The cache 
allows the ADSP-21020/21010 to perform a program memory data access 
and execute the corresponding instruction in the same cycle, without any 
delay. The program sequencer fetches the instruction from the cache 
instead of program memory so that the processor can simultaneously 
access data in program memory. Only the instructions whose fetches 
conflict with program memory data accesses are cached.

1.2.3 Interrupts

The ADSP-21020/21010 has five external hardware interrupts (four 
general-purpose interrupts and a special interrupt for reset), nine 
internally generated interrupts and eight software interrupts. For the 
general-purpose external interrupts and the internal timer interrupt, the 
processor automatically stacks the arithmetic status and mode (MODE1) 
registers in parallel with servicing the interrupt, allowing four nesting 
levels of very fast service for these interrupts. 

1.2.4 Timer 

The programmable interval timer provides periodic interrupt generation. 
When enabled, the timer decrements a 32-bit count register every cycle. 
When this count register reaches zero, the ADSP-21020/21010 generates 
an interrupt and asserts its TIMEXP output. The count register is 
automatically reloaded from a 32-bit period register and the count 
resumes immediately. 
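
The decrement-and-reload behavior can be sketched in a few lines of Python. The function name run_timer and its parameters are hypothetical, and the model assumes the reload takes effect in the same cycle in which the count reaches zero. This overview does not state the exact cycle timing, so the interval between interrupts on real hardware may differ by a cycle from what this sketch shows.

```python
def run_timer(count, period, cycles):
    """Return the 1-based cycle numbers on which the modelled timer interrupts."""
    fired = []
    for cycle in range(1, cycles + 1):
        count = (count - 1) & 0xFFFFFFFF   # 32-bit count register decrements every cycle
        if count == 0:
            fired.append(cycle)            # interrupt generated, TIMEXP asserted
            count = period & 0xFFFFFFFF    # automatic reload from the period register
    return fired


print(run_timer(count=3, period=3, cycles=10))   # prints [3, 6, 9]
```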

1.2.5 Memory Buses And Interface

The external memory interface supports memory-mapped peripherals and 
slower memories with a user-defined combination of programmable wait 
states and hardware acknowledge signals. Both program memory and 
data memory addressing support page mode addressing of static column 
DRAMs. 

The processor has four internal buses: the program memory address 
(PMA) and data memory address (DMA) buses are used for the addresses 
associated with program and data memory. The program memory data 
(PMD) and data memory data (DMD) buses are used for the data 
associated with the memory spaces. These buses are extended off chip. 

The DMS and PMS signals select data memory and program memory, 
respectively. 

The program memory address (PMA) bus is 24 bits wide allowing direct 
access of up to 16M words of mixed instruction code and data. The 
program memory data (PMD) bus is 48 bits wide to accommodate the 
48-bit instruction width. Fixed-point and single-precision floating-point 
data is aligned to the upper 32 bits of the PMD bus. 

The data memory address (DMA) bus is 32 bits wide allowing direct 
access of up to 4G words of data. The data memory data (DMD) bus is 40 
bits wide on the ADSP-21020 and 32 bits wide on the ADSP-21010. On the 
ADSP-21020, fixed-point and single-precision floating-point data is 
aligned to the upper 32 bits of the DMD bus. The DMD bus provides a
path for the contents of any register in the processor to be transferred to
any other register or to any external data memory location in a single 
cycle. The data memory address comes from two sources: an absolute 
value specified in the instruction code (direct addressing) or the output of 
a data address generator (indirect addressing). 

External devices can gain control of memory buses from the ADSP-21020/
21010 with bus request/grant signals (BR and BG). To grant its buses in
response to a bus request, the ADSP-21020/21010 halts internal operations
and places its program and data memory interfaces in a high-impedance
state. In addition, three-state controls (DMTS and PMTS) allow an external
device to place either program or data memory interface in a high- 
impedance state without affecting the other interface and without halting 
the processor unless it requires a program memory access. 

1.2.6 Internal Data Transfers

Nearly every internal register of the ADSP-21020/21010 is classified as a 
universal register. ADSP-21020/21010 instructions provide for transferring
data between any two universal registers or between a universal register 
and external memory. This includes control registers and status registers, 
as well as the data registers in the register file. 

The PX registers permit data to be passed between the 48-bit PMD bus 
and the 40-bit DMD bus or between the 40-bit register file and the PMD 
bus. These registers contain hardware to handle the 8-bit width difference. 

1.2.7 Context Switching 

Many of the processor's registers have alternate registers that can be 
activated during interrupt servicing to facilitate a fast context switch. The 
data registers in the register file, DAG registers and the multiplier result 
register all have alternates. Registers active at reset are called primary 
registers, and the others are alternate registers. Bits in a mode control 
register determine the registers that are active at any particular time. 

1.2.8 Instruction Set 

The ADSP-21000 family instruction set provides a wide variety of
programming capabilities. Multifunction instructions enable computations
in parallel with data transfers, as well as simultaneous multiplier and 
ALU operations. The addressing power of the ADSP-21020/21010 gives 
you flexibility in moving data both internally and externally. Every 
instruction can be executed in a single processor cycle. The ADSP-21000 
family assembly language uses an algebraic syntax for ease of coding and 
readability. A comprehensive set of development tools supports program 
development. 



1.3 DEVELOPMENT SYSTEM 

The ADSP-21020/21010 is supported with a complete set of software and 
hardware development tools. The ADSP-21000 Family Development 
System includes software tools for programming and debugging as well 
as in-circuit emulators for system integration and debugging. 

Figure 1.2 shows the process of developing an application using the 
development tools. File name extensions (.ASM, .OBJ, etc.) at the input 
and output of each step signify different types of files. 



Figure 1.2 DSP System Development
(Legend: user file or hardware; software development tool; hardware development tool)





The development system includes the following: 

C Compiler & Runtime C Library. The C Compiler reads source files 
written in ANSI-standard C language. The compiler outputs ADSP-21xxx 
assembly language files. It comes with a standard library of C-callable 
routines. 

Numerical C Compiler. DSP/C™ is Analog Devices' implementation of 
ANSI-standard Numerical C — a set of extensions to C that allow matrix 
data types and operators. The compiler outputs ADSP-21xxx assembly 
language files. With DSP/C, signal processing algorithms are easier to
program and the compiled code is more efficient because the compiler 
directly translates matrix operations in Numerical C to the matrix 
capabilities of the ADSP-21020/21010. 

Assembler. The assembler inputs a file of ADSP-21xxx source code and 
assembler directives and outputs a relocatable object file. The assembler 
supports standard C preprocessor directives as well as its own directives. 

Linker. The linker processes separately assembled object and library files 
to create a single executable program. It assigns memory locations to code 
and data in accordance with a user-defined architecture file, a text file that 
describes the memory configuration of the target system. 

Assembly Library/Librarian. The assembly library contains standard 
arithmetic and DSP routines that can be called from your program, saving 
development time. You can add your own routines to this library using 
the librarian function. 

Simulator. The simulator executes an ADSP-21020/21010 program in 
software in the same way that the processor would in hardware. The 
simulator also simulates the memory and I/O devices specified in the 
architecture file. The simulator's window-based user interface lets you 
interactively observe and alter data contained in the processor's registers 
and in memory. 

PROM Splitter. The PROM splitter translates an ADSP-21xxx executable 
program into one of several formats (Motorola S2 and S3, Intel Hex 
Record, etc.) that can be used to configure a PROM or be downloaded to a 
target from a microcontroller.

In-Circuit Emulator. The EZ-ICE® emulator provides hardware
debugging capabilities for ADSP-21020/21010 systems with stand-alone 
in-circuit emulation, running the target board processor in self-emulation 
mode. The emulator design allows program execution with little or no 
degradation in processor performance. 

The emulator features the same window-based user interface as the 
simulator, for ease-of-use and faster development cycles. The emulator 
communicates with the target processor through the processor's JTAG test 
access port. This 7-wire interface allows for a probe that is smaller and less 
intrusive than a traditional full-pinout emulator-to-target connector. 

1.4 MANUAL ORGANIZATION 

The chapters of this manual are organized as follows: 

Chapter 2, Computation Units. Describes the capabilities and operation 
of the ALU, multiplier and shifter. 

Chapter 3, Program Sequencing. Describes the processor's features for 
executing various types of program structures: subroutines, loops and 
interrupt service routines. Also describes the operation of the instruction 
cache and the handling of interrupts. 

Chapter 4, Data Addressing. Describes how to use the data address 
generators to address data in data memory and program memory. 

Chapter 5, Timer. Describes the operation of the programmable interval 
timer. 

Chapter 6, Memory Interface. Describes how the processor accesses 
external data and program memory. Also describes the memory 
management features of the processor. 

Chapter 7, Instruction Summary. An overview of the ADSP-21000 family 
instruction set. Use this chapter as a reference when writing programs in 
assembly language. The chapter also contains a summary of programming 
reminders and restrictions. 

Chapter 8, Assembly Programming Tutorial. Presents two examples of 
ADSP-21020/21010 programs and describes in detail how they were 
written. Describes many techniques to take advantage of the processors' 
architecture and instruction set.

Chapter 9, Hardware System Design. Presents numerous system
diagrams based on the ADSP-21020. Hardware considerations such as 
clocking, reset, flags, capacitive loading and emulator access are also 
addressed. Examples of a program memory boot at reset and a host 
interface are shown. 

Appendix A, Instruction Set Reference. Describes each instruction in 
detail. Also details the instruction opcodes. The compute portions of
instructions are described in Appendix B.

Appendix B, Compute Operation Reference. Describes each compute 
operation and its opcode field in detail. 

Appendix C, IEEE 1149.1 JTAG Test Access Port. Describes the features 
and operation of the IEEE 1149.1 (JTAG) test access port. 

Appendix D, Numeric Formats. Shows all the floating-point and fixed- 
point data formats supported by the ADSP-21020/21010. 

Appendix E, Control/Status Registers. Summarizes the contents of all 
ADSP-21020/21010 registers that contain control and/or status bits. Also 
describes bit manipulation operations available.


Computation Units


2.1 OVERVIEW 

The computation units of the ADSP-21020 and ADSP-21010 provide the 
numeric processing power for performing DSP algorithms. The 
ADSP-21020/21010 contains three computation units: an arithmetic/logic
unit (ALU), a multiplier and a shifter. Both fixed-point and floating-point 
operations are supported by the processor. Each computation unit 
executes instructions in a single cycle. 

The ALU performs a standard set of arithmetic and logic operations in 
both fixed-point and floating-point formats. The multiplier performs 
floating-point and fixed-point multiplication as well as fixed-point
multiply/add and multiply/subtract operations. The shifter performs
logical and arithmetic shifts, bit manipulation, field deposit and extraction 
operations on 32-bit operands and can derive exponents as well. 

The computation units are architecturally arranged in parallel, as shown 
in Figure 2.1 on the next page. The output of any computation unit may be 
the input of any computation unit on the next cycle. The computation 
units input data from and output data to a 10-port register file that 
consists of sixteen primary registers and sixteen alternate registers. The 
register file is accessible to the ADSP-21020/21010 program and data
memory data buses for transferring data between the computation units 
and external memory or other parts of the processor. 

The individual registers of the register file are prefixed with an "f" when 
used in floating-point computations (in assembly language source code). 
The registers are prefixed with an "r" when used in fixed-point 
computations. The following instructions, for example, use the same 
registers: 


F0=F1*F2;    floating-point multiply

R0=R1*R2;    fixed-point multiply

The "f" and "r" prefixes do not affect the 40-bit (or 32-bit) data transfer; 
they only determine how the ALU, multiplier, or shifter treat the data. 



Figure 2.1 Computation Units

This chapter covers the following topics: 

• Data Formats and Rounding 

• ALU Architecture and Functions 

• Multiplier Architecture and Functions 

• Shifter Architecture and Functions 

• Multifunction Computations 

• Register File and Data Transfers 


2.2 IEEE FLOATING-POINT OPERATIONS 

The ADSP-21020/21010 multiplier and ALU support the single-precision 
floating-point format specified in the IEEE 754/854 standard. This 
standard is described in Appendix D. The ADSP-21020/21010 is 
IEEE 754/854 compatible for single-precision floating-point operations in 
all respects except that: 

• The ADSP-21020/21010 does not provide inexact flags. 

• NAN ("Not-A-Number") inputs generate an invalid exception and
return a quiet NAN (all 1s).

• Denormal operands are flushed to zero when input to a computation
unit and do not generate an underflow exception. Any denormal or 
underflow result from an arithmetic operation is flushed to zero and 
an underflow exception is generated. 

• Round-to-nearest and round-toward-zero modes are supported. 
Rounding to +Infinity and rounding to -Infinity are not supported. 

In addition, the ADSP-21020 supports a 40-bit extended precision floating- 
point mode, which has eight additional LSBs of the mantissa and is 
compliant with the 754/854 standards; however, results in this format are 
more precise than the IEEE single-precision standard specifies. The ADSP- 
21010 does not offer this 40-bit format. 
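
The flush-to-zero treatment of denormal inputs listed above can be illustrated with a short Python sketch that works on the 32-bit single-precision bit pattern. The function name flush_denormal is hypothetical, and keeping the sign of the flushed value is an assumption of this example; the text above does not say how the sign is handled.

```python
import struct


def flush_denormal(x):
    """Return x as a single-precision value, with denormals replaced by zero."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF      # 8-bit biased exponent field
    fraction = bits & 0x7FFFFF          # 23 stored mantissa bits
    if exponent == 0 and fraction != 0:
        bits &= 0x80000000              # assumption: keep only the sign bit
    return struct.unpack("<f", struct.pack("<I", bits))[0]


print(flush_denormal(1e-40))        # a single-precision denormal becomes 0.0
print(flush_denormal(2.0 ** -126))  # the smallest normal value passes through
```

A workstation simulation that keeps denormals can give slightly different results near zero, so a filter of this kind is one way to approximate the processor's input handling when comparing outputs.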

2.2.1 Extended Floating-Point Precision (ADSP-21020 Only) 

Floating-point data can be either 32 or 40 bits wide on the ADSP-21020. 
Extended precision floating-point format (8 bits of exponent and 32 bits of 
mantissa) is selected if the RND32 bit in the MODE1 register is cleared (0). 
If this bit is set (1), then normal IEEE precision is used (8 bits exponent 
and 24 bits of mantissa). In this case, the computation unit sets the eight 
LSBs of floating-point inputs to zeros before performing the operation. 

The mantissa of a result is rounded to 23 bits (not including the hidden 
bit) and the 8 LSBs of the 40-bit result are set to zeros to form a 32-bit 
number that is equivalent to the IEEE standard result. 

On the ADSP-21010, the RND32 bit must be set to 1 at system powerup (at 
the beginning of your program). 
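
The input handling in 32-bit mode can be pictured with a short Python sketch. It assumes that a 40-bit word is held in an integer with the IEEE single-precision bit pattern in the upper 32 bits and the eight additional mantissa LSBs at the bottom. The names MASK40, clear_low8 and upper32_as_float are hypothetical and exist only for this illustration.

```python
import struct

MASK40 = (1 << 40) - 1


def clear_low8(word40):
    """Zero the eight extra mantissa LSBs, as is done to inputs when RND32 is set."""
    return word40 & MASK40 & ~0xFF


def upper32_as_float(word40):
    """Interpret the upper 32 bits of a 40-bit word as an IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<I", (word40 >> 8) & 0xFFFFFFFF))[0]


# 1.0 in single precision is 0x3F800000; set some of the extra LSBs below it.
word = (0x3F800000 << 8) | 0x7F
cleared = clear_low8(word)
print(hex(cleared), upper32_as_float(cleared))
```

After clearing, the word carries exactly the single-precision value 1.0, so an operation on it sees the same operand that a 32-bit IEEE machine would.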

2.2.2 Floating-Point Exceptions 

The multiplier and ALU each provide exception information when 
executing floating-point operations. Each unit updates overflow, 
underflow and invalid operation flags in the arithmetic status (ASTAT)
register and in the sticky status (STKY) register. An underflow, overflow 
or invalid operation from any unit also generates a maskable interrupt. 
Thus, there are three ways to handle floating-point exceptions: 

• Interrupts. The exception condition is handled immediately in an 
interrupt service routine. You would use this method if it was 
important to correct all exceptions as they happen. 

• ASTAT register. The exception flags in the ASTAT register pertaining
to a particular arithmetic operation are tested after the operation is 
performed. You would use this method to monitor a particular 
floating-point operation.

• STKY register. Exception flags in the STKY register are examined at the
end of a series of operations. If any flags are set, some of the results are 
incorrect. You would use this method if exception handling was not 
critical. 
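
The difference between the per-operation flags and the sticky flags can be sketched as follows. The class StatusModel, its attributes and the flag names are hypothetical; the model assumes only what is stated above, namely that ASTAT reflects a particular operation while STKY accumulates flags across a series of operations.

```python
class StatusModel:
    """Toy model of per-operation (ASTAT-like) and sticky (STKY-like) flags."""

    def __init__(self):
        self.astat = set()   # flags from the most recent operation only
        self.stky = set()    # flags accumulated since the last explicit clear

    def record(self, flags):
        """Record the exception flags raised by one arithmetic operation."""
        self.astat = set(flags)
        self.stky |= self.astat

    def clear_sticky(self):
        """Software clears the sticky flags after examining them."""
        self.stky.clear()


status = StatusModel()
status.record({"overflow"})   # first operation overflows
status.record(set())          # second operation is clean
print(status.astat, status.stky)
```

After the second operation the per-operation flags are empty, but the sticky set still contains "overflow", which is why a single test at the end of a series of operations is enough to detect that some result was affected.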

2.3 FIXED-POINT OPERATIONS 

Fixed-point numbers are always represented in 32 bits and are left- 
justified (occupy the 32 MSBs) in the 40-bit data fields of the ADSP-21020. 
They may be treated as fractional or integer numbers and as unsigned or 
twos-complement. Each computation unit has its own limitations on how 
these formats may be mixed for a given operation. The computation units 
read 32-bit operands from 40-bit registers, ignoring the 8 LSBs, and write 
32-bit results, zeroing the 8 LSBs. 
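
A minimal Python sketch of this left-justified layout follows. The function names are hypothetical, and the signed fractional interpretation assumes a 1.31 format (one sign bit, 31 fraction bits), which is an assumption of this example rather than something stated in this section.

```python
def write_fixed(value32):
    """Place a 32-bit result in the 32 MSBs of a 40-bit field, zeroing the 8 LSBs."""
    return (value32 & 0xFFFFFFFF) << 8


def read_fixed(reg40):
    """Read a 32-bit operand from a 40-bit field, ignoring the 8 LSBs."""
    return (reg40 >> 8) & 0xFFFFFFFF


def as_signed_fraction(value32):
    """Interpret 32 bits as a twos-complement fraction (assumed 1.31 format)."""
    signed = value32 - (1 << 32) if value32 & 0x80000000 else value32
    return signed / float(1 << 31)


print(hex(write_fixed(0x12345678)))       # 0x1234567800
print(hex(read_fixed(0x12345678AB)))      # 0x12345678
print(as_signed_fraction(0x40000000))     # 0.5
```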


2.4 ROUNDING 

Two modes of rounding are supported in the ADSP-21020 and 
ADSP-21010: round-toward-zero and round-toward-nearest. The
rounding modes follow the IEEE 754 standard definitions, which are 
briefly stated as follows: 

Round-toward-Zero. If the result before rounding is not exactly 
representable in the destination format, the rounded result is that number 
which is nearer to zero. This is equivalent to truncation. 

Round-toward-Nearest. If the result before rounding is not exactly 
representable in the destination format, the rounded result is that number 
which is nearer to the result before rounding. If the result before rounding 
is exactly halfway between two numbers in the destination format 
(differing by an LSB), the rounded result is that number which has an LSB 
equal to zero. Statistically, rounding up occurs as often as rounding down, 
so there is no large sample bias. Because the maximum floating-point 
value is one LSB less than the value that represents Infinity, a result that is 
halfway between the maximum floating-point value and Infinity rounds 
to Infinity in this mode. 

The rounding mode for all ALU operations and for floating-point 
multiplier operations is determined by the TRUNC bit in the MODE1 
register. If the TRUNC bit is set, the round-to-zero mode is selected; 
otherwise, the round-to-nearest mode is used.

For fixed-point multiplier operations on fractional data, the same two
rounding modes are supported, but only the round-to-nearest operation is 
actually performed by the multiplier. Because the multiplier has a local 
result register for fixed-point operations, rounding-to-zero is 
accomplished implicitly by reading only the upper bits of the result and 
discarding the lower bits. 
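
The two rounding modes defined above can be sketched on an integer magnitude from which a number of low-order bits are dropped. The function round_drop_bits and its parameters are hypothetical names for this illustration; the sketch models only the rounding rule itself, not exponent adjustment or overflow to Infinity.

```python
def round_drop_bits(magnitude, drop, trunc):
    """Drop the `drop` low bits of a non-negative integer magnitude.

    trunc=True  models round-toward-zero (plain truncation of the magnitude).
    trunc=False models round-toward-nearest, with ties going to the value
    whose LSB is zero.
    """
    kept = magnitude >> drop
    if trunc or drop == 0:
        return kept
    remainder = magnitude & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if remainder > half or (remainder == half and (kept & 1)):
        kept += 1                      # round up; ties round up only when kept is odd
    return kept


print(round_drop_bits(0xB80, 8, trunc=False))  # tie, kept value 11 is odd: rounds to 12
print(round_drop_bits(0xA80, 8, trunc=False))  # tie, kept value 10 is even: stays 10
print(round_drop_bits(0xB80, 8, trunc=True))   # truncation: 11
```

The two tie cases round in opposite directions, which is the property that keeps round-toward-nearest free of a systematic bias over a large sample.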


2.5 ALU 

The ALU performs arithmetic operations on fixed-point or floating-point 
data and logical operations on fixed-point data. ALU fixed-point 
instructions operate on 32-bit fixed-point operands and output 32-bit 
fixed-point results. ALU floating-point instructions operate on 32-bit or 
40-bit floating-point operands and output 32-bit or 40-bit floating-point 
results. 

ALU instructions include: 

• Floating-point addition, subtraction, add/subtract, average

• Fixed-point addition, subtraction, add/subtract, average

• Floating-point manipulation: binary log, scale, mantissa 

• Fixed-point add with carry, subtract with borrow, increment, 
decrement 

• Logical AND, OR, XOR, NOT 

• Functions: Absolute value, pass, min, max, clip, compare 

• Format conversion 

• Reciprocal and reciprocal square root primitives 

Dual add/subtract and parallel ALU and multiplier operations are
described under "Multifunction Computations," later in this chapter. 

2.5.1 ALU Operation 

The ALU takes one or two input operands, called the X input and the Y 
input, which can be any data registers in the register file. It usually returns 
one result; in add/subtract operations it returns two results, and in
compare operations it returns no result (only flags are updated). ALU 
results can be returned to any location in the register file. 

Input operands are transferred from the register file during the first half of 
the cycle. Results are transferred to the register file during the second half 
of the cycle. Thus the ALU can read and write the same register file 
location in a single cycle. 

If the ALU operation is fixed-point, the X input and Y input are each
treated as a 32-bit fixed-point operand. The upper 32 bits from the source 
location in the register file are transferred. For fixed-point operations, the 
result(s) are always 32-bit fixed-point values. Some floating-point 
operations (LOGB, MANT and FIX) can also yield fixed-point results. 
Fixed-point results are transferred to the upper 32 bits of the register file. The
lower eight bits of the register file destination are cleared. 
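
The placement of fixed-point data within a 40-bit register file location can be modeled in a few lines. The following Python sketch is illustrative only; the helper names fixed_to_reg40 and reg40_to_fixed are hypothetical and are not part of any Analog Devices tool.

```python
MASK32 = (1 << 32) - 1
MASK40 = (1 << 40) - 1

def fixed_to_reg40(value32):
    """Write a 32-bit fixed-point result to bits 39-8; bits 7-0 are cleared."""
    return ((value32 & MASK32) << 8) & MASK40

def reg40_to_fixed(reg40):
    """Read a fixed-point operand: only bits 39-8 of the location are used."""
    return (reg40 >> 8) & MASK32

# A fixed-point result of 0xFF occupies the upper 32 bits of the location.
print(hex(fixed_to_reg40(0x000000FF)))
# Whatever sits in the eight LSBs is ignored when the operand is read.
print(hex(reg40_to_fixed(0x12345678AB)))
```

The later sketches in this chapter work on the 32-bit field directly, i.e. on the value that reg40_to_fixed would return.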

The format of fixed-point operands and results depends on the operation. 
In most arithmetic operations, there is no need to distinguish between 
integer and fractional formats. Fixed-point inputs to operations such as 
scaling a floating-point value are treated as integers. For purposes of 
determining status such as overflow, fixed-point arithmetic operands and 
results are treated as twos-complement numbers. 

2.5.2 ALU Operating Modes 

The ALU is affected by three mode status bits in the MODE1 register; the 
ALU saturation bit affects ALU operations that yield fixed-point results, 
and the rounding mode and rounding boundary bits affect floating-point 
operations in both the ALU and multiplier. 


MODE1
Bit   Name     Function
13    ALUSAT   1=Enable ALU saturation (full scale in fixed-point);
               0=No ALU saturation
15    TRUNC    1=Truncation; 0=Round to nearest
16    RND32    1=Round to 32 bits; 0=Round to 40 bits
               (RND32 must be set to 1 on ADSP-21010)


2.5.2.1 Saturation Mode

In saturation mode, all positive fixed-point overflows cause the maximum 
positive fixed-point number (0x7FFF FFFF) to be returned, and all 
negative overflows cause the maximum negative number (0x8000 0000) to 
be returned. If the ALUSAT bit is set, fixed-point results that overflow are 
saturated. If the ALUSAT bit is cleared, fixed-point results that overflow 
are not saturated; the upper 32 bits of the result are returned unaltered. 
The ALU overflow flag reflects the ALU result before saturation. 
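
The effect of the ALUSAT bit on a fixed-point addition can be sketched as follows. This is a behavioral model under stated assumptions, not a description of the hardware: the function name alu_add_fixed is hypothetical, operands are passed as signed Python integers, and the unsaturated result is assumed to wrap modulo 2^32.

```python
INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000

def alu_add_fixed(x, y, alusat):
    """Add two twos-complement 32-bit values. Returns (result, av)."""
    total = x + y
    # AV reflects the result before saturation is applied.
    av = total > INT32_MAX or total < INT32_MIN
    if av and alusat:
        result = INT32_MAX if total > 0 else INT32_MIN
    else:
        # Assumed behavior: keep the low 32 bits and reinterpret as signed.
        wrapped = total & 0xFFFFFFFF
        result = wrapped - (1 << 32) if wrapped & 0x80000000 else wrapped
    return result, av

print(alu_add_fixed(INT32_MAX, 1, alusat=True))
print(alu_add_fixed(INT32_MAX, 1, alusat=False))
```

In both calls the overflow flag is reported; only the returned value differs.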


2.5.2.2 Floating-Point Rounding Modes

The ALU supports two IEEE rounding modes. If the TRUNC bit is set, the 
ALU rounds a result to zero (truncation). If the TRUNC bit is cleared, the 
ALU rounds to nearest. 



2.5.2.3 Floating-Point Rounding Boundary

The results of floating-point ALU operations can be either 32-bit or 40-bit 
floating-point data on the ADSP-21020. If the RND32 bit is set, the eight 
LSBs of each input operand are flushed to zeros before the ALU operation 
is performed (except for the RND operation), and ALU floating-point 
results are output in the 32-bit IEEE format. The lower eight bits of the 
result are cleared. If the RND32 bit is cleared, the ALU inputs 40-bit 
operands unchanged and outputs 40-bit results from floating-point 
operations, and all 40 bits are written to the specified register file location. 

In fixed-point to floating-point conversion, the rounding boundary is 
always 40 bits even if the RND32 bit is set. 
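
The interaction of the TRUNC bit with the 32-bit rounding boundary can be illustrated on a raw 40-bit pattern. The sketch below is a simplified model with several assumptions that the text above does not state: the function name round40_to_32 is hypothetical, ties in round-to-nearest mode are assumed to go to the even value (the IEEE 754 default), and NANs and Infinities are not handled.

```python
MASK40 = (1 << 40) - 1

def round40_to_32(reg40, trunc):
    """Round a 40-bit floating-point pattern at the 32-bit boundary.

    Returns a 40-bit pattern whose eight LSBs are zero.
    """
    upper = reg40 >> 8      # sign, exponent and 23-bit mantissa
    lower = reg40 & 0xFF    # the eight extended mantissa bits
    if not trunc:
        # Round to nearest; ties to even is an assumption of this sketch.
        if lower > 0x80 or (lower == 0x80 and (upper & 1)):
            upper += 1      # a mantissa carry ripples into the exponent
    return (upper << 8) & MASK40

print(hex(round40_to_32(0x3F800000FF, trunc=True)))
print(hex(round40_to_32(0x3F800000FF, trunc=False)))
```

With TRUNC set the extra bits are simply dropped; with TRUNC cleared the same input moves up to the next 32-bit value.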

2.5.3 ALU Status Flags 

The ALU updates seven status flags in the ASTAT register, shown below,
at the end of each operation. The states of these flags reflect the result of
the most recent ALU operation. The ALU updates the Compare
Accumulation bits in ASTAT at the end of every Compare operation. The
ALU also updates four "sticky" status flags in the STKY register. Once set, 
a sticky flag remains high until explicitly cleared. 


ASTAT
Bit     Name   Definition
0       AZ     ALU result zero or floating-point underflow
1       AV     ALU overflow
2       AN     ALU result negative
3       AC     ALU fixed-point carry
4       AS     ALU X input sign (ABS and MANT operations)
5       AI     ALU floating-point invalid operation
10      AF     Last ALU operation was a floating-point operation
31-24   CACC   Compare Accumulation register (results of last 8
               Compare operations)

STKY
Bit     Name   Definition
0       AUS    ALU floating-point underflow
1       AVS    ALU floating-point overflow
2       AOS    ALU fixed-point overflow
5       AIS    ALU floating-point invalid operation

Flag update occurs at the end of the cycle in which the status is generated
and is available on the next cycle. If a program writes the ASTAT register
or STKY register explicitly in the same cycle that the ALU is performing
an operation, the explicit write to ASTAT or STKY supersedes any flag
update from the ALU operation.

2.5.3.1 ALU Zero Flag (AZ)

The zero flag is determined for all fixed-point and floating-point ALU 
operations. AZ is set whenever the result of an ALU operation is zero. AZ 
also signifies floating-point underflow; see the next section. It is otherwise 
cleared. 

2.5.3.2 ALU Underflow Flag (AZ, AUS)

Underflow is determined for all ALU operations that return a floating- 
point result and for floating-point to fixed-point conversion. AUS is set 
whenever the result of an ALU operation is smaller than the smallest 
number representable in the output format. AZ is set whenever a floating- 
point result is smaller than the smallest number representable in the 
output format. 

2.5.3.3 ALU Negative Flag (AN) 

The negative flag is determined for all ALU operations. It is set whenever 
the result of an ALU operation is negative. It is otherwise cleared. 

2.5.3.4 ALU Overflow Flag (AV, AOS, AVS)

Overflow is determined for all fixed-point and floating-point ALU 
operations. For fixed-point results, AV and AOS are set whenever the 
XOR of the two most significant bits is a 1; otherwise AV is cleared. For 
floating-point results AV and AVS are set whenever the post-rounded 
result overflows (unbiased exponent > 127); otherwise AV is cleared. 

2.5.3.5 ALU Fixed-Point Carry Flag (AC) 

The carry flag is determined for all fixed-point ALU operations. For fixed-
point arithmetic operations, AC is set if there is a carry out of the most
significant bit of the result, and is otherwise cleared. AC is cleared for
fixed-point logic, PASS, MIN, MAX, COMP, ABS, and CLIP operations. 
The ALU reads the AC flag in fixed-point addition with carry and fixed- 
point subtraction with carry operations. 
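
The carry flag is what makes multiple-precision arithmetic possible. The following Python sketch, with the hypothetical helpers add_with_carry and add64, models a 64-bit addition built from two 32-bit additions in the manner of Rn = Rx + Ry followed by Rn = Rx + Ry + CI. It is a behavioral illustration only.

```python
MASK32 = 0xFFFFFFFF

def add_with_carry(rx, ry, ci=0):
    """Model Rn = Rx + Ry + CI on 32-bit fields. Returns (rn, ac)."""
    total = (rx & MASK32) + (ry & MASK32) + (ci & 1)
    # AC is the carry out of the most significant bit of the result.
    return total & MASK32, (total >> 32) & 1

def add64(a, b):
    """Add two 64-bit values as a low-word add and a high-word add with carry."""
    lo, ac = add_with_carry(a & MASK32, b & MASK32)
    hi, _ = add_with_carry(a >> 32, b >> 32, ac)
    return (hi << 32) | lo

print(hex(add64(0x00000001FFFFFFFF, 0x0000000000000001)))
```

The low-word addition produces a carry, which the high-word addition then consumes through its carry input.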



2.5.3.6 ALU Sign Flag (AS) 

The sign flag is determined for only the fixed-point and floating-point 
ABS operations and the MANT operation. AS is set if the input operand is 
negative. It is otherwise cleared. The ALU clears AS for all operations 
other than ABS and MANT operations; this is different from the operation 
of ADSP-2100 family processors, which do not update the AS flag on 
operations other than ABS. 

2.5.3.7 ALU Invalid Flag (AI)

The invalid flag is determined for all floating-point ALU operations. AI 
and AIS are set whenever 

• an input operand is a NAN 

• an addition of opposite-signed Infinities is attempted 

• a subtraction of like-signed Infinities is attempted 

• when saturation mode is not set, a floating-point to fixed-point 
conversion results in an overflow or operates on an Infinity. 

AI is otherwise cleared. 

2.5.3.8 ALU Floating-Point Flag (AF)

AF is determined for all fixed-point and floating-point ALU operations. It 
is set if the last operation was a floating-point operation; it is otherwise 
cleared. 

2.5.3.9 Compare Accumulation

Bits 31-24 in the ASTAT register store the flag results of up to eight ALU
compare operations. These bits form a right-shift register. When an ALU 
compare operation is executed, the eight bits are shifted toward the LSB 
(bit 24 is lost). The MSB, bit 31, is then written with the result of the 
compare operation. If the X operand is greater than the Y operand in the 
compare instruction, bit 31 is set; it is cleared otherwise. The accumulated 
compare flags can be used to implement 2- and 3-dimensional clipping 
operations for graphics applications. 
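
The shift-register behavior of the Compare Accumulation bits can be modeled as shown below. The function name update_cacc is hypothetical; the sketch operates on a 32-bit ASTAT value and leaves bits 23-0 untouched.

```python
def update_cacc(astat, x_greater_than_y):
    """Return ASTAT after one compare operation updates bits 31-24."""
    cacc = (astat >> 24) & 0xFF
    cacc >>= 1                  # shift toward the LSB; the old bit 24 is lost
    if x_greater_than_y:
        cacc |= 0x80            # the new compare result is written to bit 31
    return (astat & 0x00FFFFFF) | (cacc << 24)

astat = 0
for result in (True, False, True):
    astat = update_cacc(astat, result)
print(hex(astat))
```

After three compares the most recent result is in bit 31, the one before it in bit 30, and the oldest in bit 29.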

2.5.4 ALU Instruction Summary

Instruction

Fixed-point:
c  Rn = Rx + Ry
c  Rn = Rx - Ry
c  Rn = Rx + Ry + CI
c  Rn = Rx - Ry + CI - 1
   Rn = (Rx + Ry)/2
   COMP(Rx, Ry)
   Rn = Rx + CI
   Rn = Rx + CI - 1
   Rn = Rx + 1
   Rn = Rx - 1
c  Rn = -Rx
c  Rn = ABS Rx
   Rn = PASS Rx
c  Rn = Rx AND Ry
c  Rn = Rx OR Ry
c  Rn = Rx XOR Ry
c  Rn = NOT Rx
   Rn = MIN(Rx, Ry)
   Rn = MAX(Rx, Ry)
   Rn = CLIP Rx BY Ry

Floating-point:
   Fn = Fx + Fy
   Fn = Fx - Fy
   Fn = ABS (Fx + Fy)
   Fn = ABS (Fx - Fy)
   Fn = (Fx + Fy)/2
   COMP(Fx, Fy)
   Fn = -Fx
   Fn = ABS Fx
   Fn = PASS Fx
   Fn = RND Fx
   Fn = SCALB Fx BY Ry
   Rn = MANT Fx
   Rn = LOGB Fx
   Rn = FIX Fx BY Ry
   Rn = FIX Fx
   Fn = FLOAT Rx BY Ry
   Fn = FLOAT Rx
   Fn = RECIPS Fx
   Fn = RSQRTS Fx
   Fn = Fx COPYSIGN Fy
   Fn = MIN(Fx, Fy)
   Fn = MAX(Fx, Fy)
   Fn = CLIP Fx BY Fy

[Status flag columns: ASTAT flags AZ, AV, AN, AC, AS, AI, AF, CACC; STKY flags AUS, AVS, AOS, AIS. The per-instruction flag entries are not legible in this copy.]

Rn, Rx, Ry = Any register file location; treated as fixed-point
Fn, Fx, Fy = Any register file location; treated as floating-point
c = ADSP-21xx-compatible instruction

*   set or cleared, depending on results of instruction
**  may be set (but not cleared), depending on results of instruction
-   no effect

2.6 MULTIPLIER

The multiplier performs fixed-point or floating-point multiplication and
fixed-point multiply/accumulate operations. Fixed-point multiply/
accumulates may be performed with either cumulative addition or
cumulative subtraction. Floating-point multiply/accumulates can be
accomplished through parallel operation of the ALU and multiplier, using
multifunction instructions. See "Multifunction Computations," later in this
chapter.

Multiplier floating-point instructions operate on 32-bit or 40-bit floating- 
point operands and output 32-bit or 40-bit floating-point results. 

Multiplier fixed-point instructions operate on 32-bit fixed-point data and 
produce 80-bit results. Inputs are treated as fractional or integer, unsigned 
or twos-complement. 

Multiplier instructions include: 

• Floating-point multiplication 

• Fixed-point multiplication 

• Fixed-point multiply/accumulate with addition, rounding optional

• Fixed-point multiply/accumulate with subtraction, rounding optional

• Rounding result register 

• Saturating result register 

• Clearing result register 

2.6.1 Multiplier Operation 

The multiplier takes two input operands, called the X input and the Y 
input, which can be any data registers in the register file. Fixed-point 
operations can accumulate fixed-point results in either of two local 
multiplier result (MR) registers or write results back to the register file. 
Results stored in the MR registers can also be rounded or saturated in 
separate operations. Floating-point operations yield floating-point results, 
which are always written directly back to the register file. 

Input operands are transferred during the first half of the cycle. Results 
are transferred during the second half of the cycle. Thus the multiplier can 
read and write the same register file location in a single cycle. 

If the multiplier operation is fixed-point, inputs taken from the register file 
are read from the upper 32 bits of the source location. Fixed-point 
operands may be treated as both in integer format or both in fractional 
format. The format of the result is the same as the format of the inputs. 
Each fixed-point operand may be treated as either an unsigned or a twos-
complement number. If both inputs are fractional and signed, the
multiplier automatically shifts the result left one bit to remove the 
redundant sign bit. The input data type is specified within the multiplier 
instruction. 

2.6.2 Fixed-Point Results 

Fixed-point operations yield 80-bit results. The location of a result in the 
80-bit field depends on whether the result is in fractional or integer 
format, as shown in Figure 2.2. If the result is sent directly to the register 
file, the 32 bits that have the same format as the input data are transferred, 
i.e. bits 63-32 for a fractional result or bits 31-0 for an integer result. The 
eight LSBs of the 40-bit register file location are zero-filled. Fractional 
results can be rounded to nearest before being sent to the register file, as
explained later in this chapter. If rounding is not specified, discarding bits 
31-0 effectively truncates a fractional result (rounds to zero). 
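
The placement of a signed product in the 80-bit field can be sketched in Python. The names to_signed32, multiply_signed and result_for_register_file are hypothetical, and the model covers only the case in which both inputs are twos-complement.

```python
MASK80 = (1 << 80) - 1

def to_signed32(value):
    """Interpret a 32-bit field as a twos-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def multiply_signed(x32, y32, fractional):
    """Return the 80-bit result field for a signed-by-signed multiply."""
    product = to_signed32(x32) * to_signed32(y32)
    if fractional:
        product <<= 1           # remove the redundant sign bit
    return product & MASK80

def result_for_register_file(mr80, fractional):
    """Select bits 63-32 (fractional) or bits 31-0 (integer)."""
    return (mr80 >> 32) & 0xFFFFFFFF if fractional else mr80 & 0xFFFFFFFF

# 0.5 * 0.5 in fractional format: 0x40000000 * 0x40000000
mr = multiply_signed(0x40000000, 0x40000000, fractional=True)
print(hex(result_for_register_file(mr, fractional=True)))
```

The fractional result read from bits 63-32 is 0.25 in the same format as the inputs, which is the purpose of the one-bit left shift.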


Bits:         79-64       63-32                31-0
Register:     MR2         MR1                  MR0
Fractional:   OVERFLOW    FRACTIONAL RESULT    UNDERFLOW
Integer:      OVERFLOW    OVERFLOW             INTEGER RESULT

Figure 2.2 Multiplier Fixed-Point Result Placement

MR Registers 

The entire result can be sent to one of two dedicated 80-bit result (MR) 
registers. The MR registers have identical format; each is divided into 
MR2, MR1 and MR0 registers that can be individually read from or
written to the register file. When data is read from MR2, it is sign-
extended to 32 bits (see Figure 2.3). The eight LSBs of the 40-bit register
file location are zero-filled when data is read from MR2, MR1 or MR0 to
the register file. Data is written into MR2, MR1 or MR0 from the 32 MSBs
of a register file location; the eight LSBs are ignored. Data written to MR1
is sign-extended to MR2, i.e. the MSB of MR1 is repeated in the 16 bits of
MR2. Data written to MR0, however, is not sign-extended.
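
The sign-extension rules for MR transfers can be modeled on an 80-bit integer. The sketch below uses the hypothetical helpers write_mr1 and read_mr2 and works on the 32-bit field of the register file location.

```python
def write_mr1(mr80, value32):
    """Write MR1 from a 32-bit field; the MSB of MR1 is repeated in MR2."""
    value32 &= 0xFFFFFFFF
    mr0 = mr80 & 0xFFFFFFFF                     # MR0 is left unchanged
    mr2 = 0xFFFF if value32 & 0x80000000 else 0x0000
    return (mr2 << 64) | (value32 << 32) | mr0

def read_mr2(mr80):
    """Read the 16-bit MR2 field, sign-extended to 32 bits."""
    mr2 = (mr80 >> 64) & 0xFFFF
    return mr2 | 0xFFFF0000 if mr2 & 0x8000 else mr2

mr = write_mr1(0, 0x80000000)
print(hex(mr))
print(hex(read_mr2(mr)))
```

Writing a negative value to MR1 therefore leaves a consistent 80-bit twos-complement number in the MR register.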

Figure 2.3 MR Transfer Formats

The two MR registers are designated MRF (foreground) and MRB
(background); foreground refers to those registers currently activated by
the SRCU bit in the MODE1 register, and background refers to those that
are not. In the case that only one MR register is used at a time, the SRCU
bit activates one or the other to facilitate context switching. However, 
unlike other registers for which alternate sets exist, both MR register sets 
are accessible at the same time. All (fixed-point) accumulation instructions 
may specify either result register for accumulation, regardless of the state 
of the SRCU bit. Thus, instead of using the MR registers as a primary and 
an alternate, you can use them as two parallel accumulators. This feature 
facilitates complex math. 

Transfers between MR registers and the register file are considered 
computation unit operations, since they involve the multiplier. Thus, 
although the syntax for the transfer is the same as for any other transfer to 
or from the register file, an MR transfer is placed in an instruction where a 
computation is normally specified. For example, the ADSP-21020 can
perform a multiply/accumulate in parallel with a read of data memory, as
in:

MRF=MRF-R5*R0, R6=DM(I1,M2);

or it can perform an MR transfer instead of the computation, as in:

R5=MR1F, R6=DM(I1,M2);

2.6.3 Fixed-Point Operations 

In addition to multiplication, fixed-point operations include accumulation, 
rounding and saturation of fixed-point data. There are three MR register 
operations: Clear, Round and Saturate. 

2.6.3.1 Clear MR Register

The clear operation resets the specified MR register to zero. This operation
is performed at the start of a multiply/accumulate operation to remove
results left over from the previous operation.

2.6.3.2 Round MR Register

Rounding of a fixed-point result occurs either as part of a multiply or
multiply/accumulate operation or as an explicit operation on the MR
register. The rounding operation applies only to fractional results (integer
results are not affected) and rounds the 80-bit MR value to nearest at bit
32, i.e. at the MR1-MR0 boundary. The rounded result in MR1 can be sent
either to the register file or back to the same MR register. To round a
fractional result to zero (truncation) instead of to nearest, you would
simply transfer the unrounded result from MR1, discarding the lower 32
bits in MR0.
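
The difference between rounding and truncating a fractional result can be sketched as follows. This model makes two assumptions that are not stated in the text above: a value exactly halfway between two results is rounded upward, and MR0 is cleared when the rounded value is written back to the MR register. The function names rnd_mr and truncated_mr1 are hypothetical.

```python
MASK80 = (1 << 80) - 1

def rnd_mr(mr80):
    """Round an 80-bit MR value to nearest at the MR1-MR0 boundary."""
    rounded = (mr80 + 0x80000000) & MASK80      # add one half of an MR1 LSB
    return rounded & ~0xFFFFFFFF                # assumed: MR0 is cleared

def truncated_mr1(mr80):
    """Round to zero by reading MR1 and discarding MR0."""
    return (mr80 >> 32) & 0xFFFFFFFF

mr = (0x20000000 << 32) | 0x80000001
print(hex(rnd_mr(mr) >> 32))
print(hex(truncated_mr1(mr)))
```

The two results differ by one LSB of MR1 because the discarded MR0 value is greater than one half.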

2.6.3.3 Saturate MR Register On Overflow 

The saturate operation sets MR to a maximum value if the MR value has 
overflowed. Overflow occurs when the MR value is greater than the 
maximum value for the data format (unsigned or twos-complement and 
integer or fractional) that is specified in the saturate instruction. There are 
six possible maximum values (shown in hexadecimal): 

                                              MR2    MR1         MR0
Maximum twos-complement fractional number
    positive                                  0000   7FFF FFFF   FFFF FFFF
    negative                                  FFFF   8000 0000   0000 0000
Maximum twos-complement integer number
    positive                                  0000   0000 0000   7FFF FFFF
    negative                                  FFFF   FFFF FFFF   8000 0000
Maximum unsigned fractional number            0000   FFFF FFFF   FFFF FFFF
Maximum unsigned integer number               0000   0000 0000   FFFF FFFF

The result from MR saturation can be sent either to the register file or back 
to the same MR register. 
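
The six maximum values follow from clamping the 80-bit MR value to the range of the selected data format. The Python sketch below reproduces them; the function name sat_mr and the format strings "SF", "SI", "UF" and "UI" (mirroring the instruction modifiers) are conventions of this example only.

```python
MASK80 = (1 << 80) - 1

def sat_mr(mr80, fmt):
    """Saturate an 80-bit MR value for format 'SF', 'SI', 'UF' or 'UI'."""
    width = 64 if fmt[1] == "F" else 32         # fractional uses MR1:MR0
    if fmt[0] == "S":
        # Interpret the 80-bit field as a twos-complement number.
        value = mr80 - (1 << 80) if mr80 >> 79 else mr80
        high = (1 << (width - 1)) - 1
        low = -(1 << (width - 1))
        value = max(low, min(high, value))
    else:
        value = min(mr80, (1 << width) - 1)
    return value & MASK80

print(hex(sat_mr(1 << 63, "SF")))
print(hex(sat_mr(1 << 32, "UI")))
```

A value that has not overflowed passes through unchanged.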



2.6.4 Floating-Point Operating Modes 

The multiplier is affected by two mode status bits in the MODE1 register: 
the rounding mode and rounding boundary bits, which affect operations 
in both the multiplier and the ALU. 

MODE1
Bit   Name    Function
15    TRUNC   1=Truncation; 0=Round to nearest
16    RND32   1=Round to 32 bits; 0=Round to 40 bits
              (RND32 must be set to 1 on ADSP-21010)

2.6.4.1 Floating-Point Rounding Modes

The multiplier supports two IEEE rounding modes for floating-point 
operations. If the TRUNC bit is set, the multiplier rounds a floating-point 
result to zero (truncation). If the TRUNC bit is cleared, the multiplier 
rounds to nearest. 

2.6.4.2 Floating-Point Rounding Boundary

Floating-point multiplier inputs and results can be either 32-bit or 40-bit 
floating-point data on the ADSP-21020. If the RND32 bit is set, the eight 
LSBs of each input operand are flushed to zeros before multiplication, and 
floating-point results are output in the 32-bit IEEE format, with the lower 
eight bits of the 40-bit register file location cleared. The mantissa of the 
result is rounded to 23 bits (not including the hidden bit). If the RND32 bit 
is cleared, the multiplier inputs full 40-bit values from the register file and 
outputs results in the 40-bit extended IEEE format, with the mantissa 
rounded to 31 bits not including the hidden bit. 

2.6.5 Multiplier Status Flags 

The multiplier updates four status flags at the end of each operation. All 
of these flags appear in the ASTAT register. The states of these flags reflect
the result of the most recent multiplier operation. The multiplier also 
updates four "sticky" status flags in the STKY register. Once set, a sticky 
flag remains high until explicitly cleared. 


ASTAT
Bit   Name   Definition
6     MN     Multiplier result negative
7     MV     Multiplier overflow
8     MU     Multiplier underflow
9     MI     Multiplier floating-point invalid operation

STKY
Bit   Name   Definition
6     MOS    Multiplier fixed-point overflow
7     MVS    Multiplier floating-point overflow
8     MUS    Multiplier underflow
9     MIS    Multiplier floating-point invalid operation


Flag update occurs at the end of the cycle in which the status is generated 
and is available on the next cycle. If a program writes the ASTAT register 
or STKY register explicitly in the same cycle that the multiplier is 
performing an operation, the explicit write to ASTAT or STKY supersedes 
any flag update from the multiplier operation. 

2.6.5.1 Multiplier Underflow Flag (MU)

Underflow is determined for all fixed-point and floating-point multiplier 
operations. It is set whenever the result of a multiplier operation is smaller 
than the smallest number representable in the output format. It is 
otherwise cleared. 


For floating-point results, MU and MUS are set whenever the post- 
rounded result underflows (unbiased exponent < -126). Denormal 
operands are treated as Zeros, therefore they never cause underflows. 

For fixed-point results, MU and MUS depend on the data format and are 
set under the following conditions: 


Twos-complement: 

Fractional: upper 48 bits all zeros or all ones, lower 32 bits not all zeros 

Integer: not possible 


Unsigned: 

Fractional: upper 48 bits all zeros, lower 32 bits not all zeros 
Integer: not possible 


If the fixed-point result is sent to an MR register, the underflowed portion
of the result is available in MR0 (fractional result only).



2.6.5.2 Multiplier Negative Flag (MN) 

The negative flag is determined for all multiplier operations. MN is set 
whenever the result of a multiplier operation is negative. It is otherwise 
cleared. 


2.6.5.3 Multiplier Overflow Flag (MV) 

Overflow is determined for all fixed-point and floating-point multiplier 
operations. 

For floating-point results, MV and MVS are set whenever the post- 
rounded result overflows (unbiased exponent > 127). 

For fixed-point results, MV and MOS depend on the data format and are 
set under the following conditions: 


Twos-complement:

Fractional: upper 17 bits of MR not all zeros or all ones 

Integer: upper 49 bits of MR not all zeros or all ones 


Unsigned: 

Fractional: upper 16 bits of MR not all zeros 

Integer: upper 48 bits of MR not all zeros 


If the fixed-point result is sent to an MR register, the overflowed portion 
of the result is available in MR1 and MR2 (integer result) or MR2 only 
(fractional result). 
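
The fixed-point overflow conditions above, together with the underflow conditions given for MU, reduce to simple tests on the upper bits of the 80-bit result. The following Python sketch expresses them directly; the function name multiplier_flags and the format strings are hypothetical conventions of this example.

```python
UPPER_BITS = {"SF": 17, "SI": 49, "UF": 16, "UI": 48}

def multiplier_flags(mr80, fmt):
    """Return (mv, mu) for an 80-bit fixed-point result in format fmt."""
    signed = fmt[0] == "S"
    n = UPPER_BITS[fmt]
    top = mr80 >> (80 - n)
    # Overflow: the upper n bits are not all zeros (or all ones, if signed).
    mv = top != 0 and not (signed and top == (1 << n) - 1)
    mu = False
    if fmt[1] == "F":                           # integer results cannot underflow
        upper48 = mr80 >> 32
        upper_ok = upper48 == 0 or (signed and upper48 == (1 << 48) - 1)
        mu = upper_ok and (mr80 & 0xFFFFFFFF) != 0
    return mv, mu

print(multiplier_flags(1 << 63, "SF"))
print(multiplier_flags(0x1234, "SF"))
```

The first call overflows the signed fractional format; the second has a nonzero value only in MR0 and so underflows.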


2.6.5.4 Multiplier Invalid Flag (MI)

The invalid flag is determined for floating-point multiplication. MI is set 
whenever: 


• an input operand is a NAN. 

• the inputs are Infinity and Zero. (Note: Denormal inputs are 
treated as Zeros.) 


MI is otherwise cleared. 


2.6.6 Multiplier Instruction Summary

Instruction

Fixed-point:

Rn  = Rx * Ry          ( S|U  S|U  F|I|FR )
MRF = Rx * Ry
MRB = Rx * Ry

Rn  = MRF + Rx * Ry    ( S|U  S|U  F|I|FR )
Rn  = MRB + Rx * Ry
MRF = MRF + Rx * Ry
MRB = MRB + Rx * Ry

Rn  = MRF - Rx * Ry    ( S|U  S|U  F|I|FR )
Rn  = MRB - Rx * Ry
MRF = MRF - Rx * Ry
MRB = MRB - Rx * Ry

Rn  = SAT MRF          (SI) (UI) (SF) (UF)
Rn  = SAT MRB
MRF = SAT MRF
MRB = SAT MRB

Rn  = RND MRF          (SF) (UF)
Rn  = RND MRB
MRF = RND MRF
MRB = RND MRB

MRF = 0
MRB = 0

MRxF = Rn
MRxB = Rn

Rn = MRxF
Rn = MRxB

Floating-point:

Fn = Fx * Fy

[Status flag columns: ASTAT flags MU, MN, MV, MI; STKY flags MUS, MOS, MVS, MIS. The per-instruction flag entries are not legible in this copy.]

Note: For floating-point multiply/accumulates, see "Multifunction Computations," later in this chapter.

*   set or cleared, depending on results of instruction
**  may be set (but not cleared), depending on results of instruction
-   no effect

Rn, Rx, Ry = R15-R0; register file location, treated as fixed-point
Fn, Fx, Fy = F15-F0; register file location, treated as floating-point
MRxF = MR2F, MR1F, MR0F; multiplier result accumulators, foreground
MRxB = MR2B, MR1B, MR0B; multiplier result accumulators, background



Multiplier Instruction Summary, cont.

Optional Modifiers for Fixed-Point:

( □ □ □ )

S      Signed input
U      Unsigned input
I      Integer input(s)
F      Fractional input(s)
FR     Fractional inputs, rounded output
(SF)   Default format for 1-input operations
(SSF)  Default format for 2-input operations

2.7 SHIFTER

The shifter operates on 32-bit fixed-point operands. Shifter operations
include:




• shifts and rotates from off-scale left to off-scale right 

• bit manipulation operations, including bit set, clear, toggle, and test 

• bit field manipulation operations including extract and deposit 

• support for ADSP-2100 family compatible fixed-point/floating-point
conversion operations (exponent extract, number of leading 1s or 0s)

2.7.1 Shifter Operation 

The shifter takes from one to three input operands: the X-input, which is 
operated upon; the Y-input, which specifies shift magnitudes, bit field 
lengths or bit positions; and the Z-input, which is operated on and 
updated (as in, for example, Rn = Rn OR LSHIFT Rx BY Ry). The shifter 
returns one output to the register file. 

Input operands are fetched from the upper 32 bits of a register file location 
(bits 39-8, as shown in Figure 2.4) or from an immediate value in the 
instruction. The operands are transferred during the first half of the cycle. 
The result is transferred to the upper 32 bits of a register (with the eight 
LSBs zero-filled) during the second half of the cycle. Thus the shifter can 
read and write the same register file location in a single cycle. 



The X-input and Z-input are always 32-bit fixed-point values. The Y-input 
is a 32-bit fixed-point value or an 8-bit field (shf8), positioned in the
register file as shown in Figure 2.4 below. 

Some shifter operations produce 8-bit or 6-bit results. These results are 
placed in either the shf8 field or the bit6 field (see Figure 2.5) and are sign- 
extended to 32 bits. Thus the shifter always returns a 32-bit result. 

32-Bit Y-Input or Result:        bits 39-8 of the register file location
8-Bit Y-Input or Result (shf8):  bits 15-8 of the register file location

Figure 2.4 Register File Fields for Shifter Instructions 


2.7.2 Bit Field Deposit & Extract Instructions 

The shifter's bit field deposit and bit field extract instructions allow the 
manipulation of groups of bits within a 32-bit fixed-point integer word. 

The Y-input for these instructions specifies two 6-bit values, bit6 and len6, 
positioned in the Ry register as shown in Figure 2.5. Bit6 and len6 are 
interpreted as positive integers. Bit6 is the starting bit position for the 
deposit or extract. Len6 is the bit field length, which specifies how many 
bits are deposited or extracted. 


12-Bit Y-Input:  len6 in bits 19-14, bit6 in bits 13-8 of the register file location

Figure 2.5 Register File Fields for FDEP, FEXT Instructions 

The FDEP (field deposit) instructions take a group of bits from the input
register Rx (starting at the LSB of the 32-bit integer field) and deposit them 
anywhere within the result register Rn. The bit6 value specifies the 
starting bit position for the deposit. See Figure 2.6. 

The FEXT (field extract) instructions extract a group of bits from anywhere 
within the input register Rx and place them in the result register Rn 
(aligned with the LSB of the 32-bit integer field). The bit6 value specifies 
the starting bit position for the extract. 


Rn=FDEP Rx BY Ry

Ry determines length of bit field to take from Rx and starting bit position for deposit in Rn.

len6 = number of bits to take from Rx, starting from LSB of 32-bit field
bit6 = starting bit position for deposit, referenced from LSB of 32-bit field

Figure 2.6 Bit Field Deposit Instruction 

The following field deposit instruction example is pictured in Figure 2.7:

R0=FDEP R1 BY R2;

R1=0x000000FF00
R2=0x0000021000     (len6 = 8, bit6 = 16)
R0=0x00FF000000

8 bits are taken from R1 and deposited in R0, starting at bit 16.
("Bit 16" is relative to reference point, the LSB of 32-bit integer field.)

Figure 2.7 Bit Field Deposit Example 




The following field extract instruction example is pictured in Figure 2.8: 

R3=FEXT R4 BY R5; 

R4=0x8788000000
R5=0x0000021700     (len6 = 8, bit6 = 23)
R3=0x0000000F00

8 bits are extracted from R4 and placed in R3, aligned to the LSB of the 32-bit integer field.

Figure 2.8 Bit Field Extract Example 
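
The two examples above can be checked with a short behavioral model. The following Python sketch is illustrative only: the function names split_ry, fdep and fext are hypothetical, all values are the 32-bit fields of the register file locations (bits 39-8), and the sign-extending (SE) forms of the instructions are not modeled.

```python
MASK32 = 0xFFFFFFFF

def split_ry(ry32):
    """Return (bit6, len6) from the 32-bit field of the Y-input."""
    return ry32 & 0x3F, (ry32 >> 6) & 0x3F

def fdep(rx32, ry32):
    """Deposit the len6 low-order bits of Rx at bit position bit6."""
    bit6, len6 = split_ry(ry32)
    field = rx32 & ((1 << len6) - 1)
    return (field << bit6) & MASK32     # bits left of the 32-bit field are lost

def fext(rx32, ry32):
    """Extract len6 bits of Rx starting at bit6, aligned to the LSB."""
    bit6, len6 = split_ry(ry32)
    return ((rx32 & MASK32) >> bit6) & ((1 << len6) - 1)

print(hex(fdep(0x000000FF, 0x00000210)))    # R0=FDEP R1 BY R2
print(hex(fext(0x87880000, 0x00000217)))    # R3=FEXT R4 BY R5
```

The operands are the upper 32 bits of the 40-bit values listed in the examples; the results correspond to the upper 32 bits of R0 and R3.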




2.7.3 Shifter Status Flags 

The shifter returns three status flags at the end of the operation. All of 
these flags appear in the ASTAT register. The SZ flag indicates if the
output is zero, the SV flag indicates an overflow, and the SS flag indicates 
the sign bit in exponent extract operations. 


ASTAT
Bit   Name   Definition
11    SV     Shifter overflow of bits to left of MSB
12    SZ     Shifter result zero
13    SS     Shifter input sign (for exponent extract only)


Flag update occurs at the end of the cycle in which the status is generated 
and is available on the next cycle. If a program writes the ASTAT register
explicitly in the same cycle that the shifter is performing an operation, the
explicit write to ASTAT supersedes any flag update caused by the shift
operation. 

2.7.3.1 Shifter Zero Flag (SZ)

SZ is affected by all shifter operations. It is set whenever: 

• the result of a shifter operation is zero, or 

• a bit test instruction specifies a bit outside of the 32-bit fixed-point 
field. 

SZ is otherwise cleared. 

2.7.3.2 Shifter Overflow Flag (SV)

SV is affected by all shifter operations. It is set whenever: 

• significant bits are shifted to the left of the 32-bit fixed-point field, 

• a bit outside of the 32-bit fixed-point field is tested, set or cleared, 

• a field that is partially or wholly to the left of the 32-bit fixed-point 
field is extracted, or 

• a LEFTZ or LEFTO operation returns a result of 32. 

SV is otherwise cleared. 
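
The last condition in the list can be illustrated with a model of the leading-bit counts. The sketch below assumes that LEFTZ returns the number of leading 0s and LEFTO the number of leading 1s in the 32-bit field; the Python function names are hypothetical, and each returns the count together with the resulting SV state.

```python
MASK32 = 0xFFFFFFFF

def leftz(rx32):
    """Count leading zeros of the 32-bit field. Returns (count, sv)."""
    count = 32 - (rx32 & MASK32).bit_length()
    return count, count == 32           # SV is set when the result is 32

def lefto(rx32):
    """Count leading ones by counting the leading zeros of the complement."""
    return leftz(~rx32 & MASK32)

print(leftz(0x00000001))
print(lefto(0xFFFFFFFF))
```

An all-zero input to leftz, or an all-one input to lefto, is the only case that yields 32.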

2.7.3.3 Shifter Sign Flag (SS)

SS is affected by all shifter operations. For the two EXP (exponent 
extract) operations, it is set if the fixed-point input operand is negative 
and cleared if it is positive. For all other shifter operations, SS is 
cleared. 




2.7.4 Shifter Instruction Summary 

Instruction

c  Rn = LSHIFT Rx BY Ry
c  Rn = LSHIFT Rx BY <data8>
c  Rn = Rn OR LSHIFT Rx BY Ry
c  Rn = Rn OR LSHIFT Rx BY <data8>
c  Rn = ASHIFT Rx BY Ry
c  Rn = ASHIFT Rx BY <data8>
c  Rn = Rn OR ASHIFT Rx BY Ry
c  Rn = Rn OR ASHIFT Rx BY <data8>
   Rn = ROT Rx BY Ry
   Rn = ROT Rx BY <data8>
   Rn = BCLR Rx BY Ry
   Rn = BCLR Rx BY <data8>
   Rn = BSET Rx BY Ry
   Rn = BSET Rx BY <data8>
   Rn = BTGL Rx BY Ry
   Rn = BTGL Rx BY <data8>
   BTST Rx BY Ry
   BTST Rx BY <data8>
   Rn = FDEP Rx BY Ry
   Rn = FDEP Rx BY <bit6>:<len6>
   Rn = Rn OR FDEP Rx BY Ry
   Rn = Rn OR FDEP Rx BY <bit6>:<len6>
   Rn = FDEP Rx BY Ry (SE)
   Rn = FDEP Rx BY <bit6>:<len6> (SE)
   Rn = Rn OR FDEP Rx BY Ry (SE)
   Rn = Rn OR FDEP Rx BY <bit6>:<len6> (SE)
   Rn = FEXT Rx BY Ry
   Rn = FEXT Rx BY <bit6>:<len6>
   Rn = FEXT Rx BY Ry (SE)
   Rn = FEXT Rx BY <bit6>:<len6> (SE)
c  Rn = EXP Rx (EX)
c  Rn = EXP Rx
   Rn = LEFTZ Rx
   Rn = LEFTO Rx

[Status flag columns: SZ, SV, SS. The per-instruction flag entries are not legible in this copy.]

* = Depends on data

Rn, Rx, Ry = Any register file location; bit fields used depend on instruction
c = ADSP-2100-compatible instruction
2.8 MULTIFUNCTION COMPUTATIONS 

In addition to the computations performed by each computation unit, the 
ADSP-21020 also provides multifunction computations that combine 
parallel operation of the multiplier and the ALU, or dual functions in the 
ALU. The two operations are performed in the same way as they are in 
corresponding single-function computations. Flags are also determined in 
the same way as for the same single-function computations, except that in 
the dual add/subtract computation the ALU flags from the two 
operations are ORed together. 

Each of the four input operands for computations that use both the ALU 
and multiplier is constrained to a different set of four register file 
locations, as summarized below. For example, the X-input to the 
multiplier can only be R8, R9, R10 or R11. In all other operations, the input 
operands may be any register file locations. 


Dual Add/Subtract 

    Ra = Rx + Ry , Rs = Rx - Ry 
    Fa = Fx + Fy , Fs = Fx - Fy 


Fixed-Point Multiply/Accumulate and Add, Subtract or Average 

    Multiplier operation (one of)        ALU operation (one of) 

    Rm=R3-0 * R7-4 (SSFR) ,              Ra=R11-8 + R15-12 
    MRF=MRF + R3-0 * R7-4 (SSF) ,        Ra=R11-8 - R15-12 
    Rm=MRF + R3-0 * R7-4 (SSFR) ,        Ra=(R11-8 + R15-12)/2 
    MRF=MRF - R3-0 * R7-4 (SSF) , 
    Rm=MRF - R3-0 * R7-4 (SSFR) , 


Floating-Point Multiplication and ALU Operation 

    Multiplier operation                 ALU operation (one of) 

    Fm=F3-0 * F7-4 ,                     Fa=F11-8 + F15-12 
                                         Fa=F11-8 - F15-12 
                                         Fa=FLOAT R11-8 by R15-12 
                                         Fa=FIX R11-8 by R15-12 
                                         Fa=(F11-8 + F15-12)/2 
                                         Fa=ABS F11-8 
                                         Fa=MAX (F11-8, F15-12) 
                                         Fa=MIN (F11-8, F15-12) 


Multiplication and Dual Add/Subtract 

    Rm = R3-0 * R7-4 (SSFR) , Ra = R11-8 + R15-12 , Rs = R11-8 - R15-12 
    Fm = F3-0 * F7-4 , Fa = F11-8 + F15-12 , Fs = F11-8 - F15-12 


Rm, Ra, Rs, Rx, Ry - Any register file location; fixed-point 
Fm, Fa, Fs, Fx, Fy - Any register file location; floating-point 

R3-0   - R3, R2, R1, R0            F3-0   - F3, F2, F1, F0 
R7-4   - R7, R6, R5, R4            F7-4   - F7, F6, F5, F4 
R11-8  - R11, R10, R9, R8          F11-8  - F11, F10, F9, F8 
R15-12 - R15, R14, R13, R12        F15-12 - F15, F14, F13, F12 

SSFR - X-input signed, Y-input signed. Fractional input, Rounded-to-nearest output 
SSF  - X-input signed, Y-input signed. Fractional input 



2.9 REGISTER FILE 

The register file provides the interface between the main processor buses 
(DMD and PMD) and the computation units. It also provides local storage 
for operands and results. The register file consists of 16 primary registers 
and 16 alternate (secondary) registers. All registers are 40 bits wide on the 
ADSP-21020 and 32 bits wide on the ADSP-21010. On the ADSP-21020, 
32-bit data from the computation units is always left-justified; on register 
reads, the eight LSBs are ignored, and on writes, the eight LSBs are written 
with zeros. 

Program memory accesses and data memory accesses to the register file 
occur on the PMD and DMD buses, respectively. One program memory 
and/or one data memory access can occur in one cycle. Transfers between 
the register file and the 40-bit DMD bus are always 40 bits wide on the 
ADSP-21020. The register file transfers data to and from the 48-bit PMD 
bus on the most significant 40 bits, writing zeros in the lower eight bits on 
transfers to the PMD bus. 

If the same register file location is specified as both the source of an 
operand and the destination of a result or memory fetch, the read occurs 
in the first half of the cycle and the write in the second half. Thus the old 
data is used as the operand before the location is updated with the new 
result data. If writes to the same location take place in the same cycle, only 
the write with higher precedence actually occurs. Precedence is 
determined by the source of the data being written; from highest to 
lowest, the precedence is: 

• Data memory or universal register 

• Program memory 

• ALU 

• Multiplier 

• Shifter 
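
The precedence rule can be sketched as a simple lookup. This is a Python model rather than processor code; the source names and the function `resolve_write` are hypothetical labels chosen for illustration, listed in the order given above.

```python
# Highest precedence first, in the order listed in the text.
PRECEDENCE = [
    "data memory or universal register",
    "program memory",
    "ALU",
    "multiplier",
    "shifter",
]

def resolve_write(writes):
    """Pick the write that actually occurs when several target one register.

    `writes` maps a source name from PRECEDENCE to the value it tries to
    write in the same cycle. Returns the winning (source, value) pair.
    """
    for source in PRECEDENCE:
        if source in writes:
            return source, writes[source]
    raise ValueError("no recognised write source")
```

For instance, if the ALU and the shifter both target the same location in one cycle, only the ALU result is written in this model.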

The individual registers of the register file are prefixed with an "f" when 
used in floating-point computations (in assembly language source code). 
The registers are prefixed with an "r" when used in fixed-point 
computations. The following instructions, for example, use the same three 
registers: 

    F0=F1 * F2;     floating-point multiply 

    R0=R1 * R2;     fixed-point multiply 

The "f" and "r" prefixes do not affect the 40-bit (or 32-bit) data transfer; 
they only determine how the ALU, multiplier, or shifter treat the data. 



2.9.1 Alternate (Secondary) Registers 

To facilitate fast context switching, the register file has an alternate register 
set. Each half of the register file (the lower half, R0 through R7, and the 
upper half, R8 through R15) can independently activate its alternate 
register set. Two bits in the MODE1 register select the active sets. Data can 
be shared between contexts by placing the data to be shared in one half of 
the register file and activating the alternate register set of the other half. 


MODE1 
Bit    Name     Definition 

7      SRRFH    Register file alternate select for R15-R8 (F15-F8) 
10     SRRFL    Register file alternate select for R7-R0 (F7-F0) 
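
The bank selection can be sketched as follows. This is a Python model with hypothetical class and method names; it uses the MODE1 bit positions given above and ignores the latency of MODE1 writes.

```python
SRRFH = 1 << 7    # MODE1 bit 7: alternate select for R15-R8
SRRFL = 1 << 10   # MODE1 bit 10: alternate select for R7-R0

class RegisterFileModel:
    """Sixteen primary and sixteen alternate registers, selected per half."""

    def __init__(self):
        self.primary = [0] * 16
        self.alternate = [0] * 16
        self.mode1 = 0

    def _bank(self, n):
        # The lower half follows SRRFL, the upper half follows SRRFH.
        select = SRRFL if n < 8 else SRRFH
        return self.alternate if self.mode1 & select else self.primary

    def read(self, n):
        return self._bank(n)[n]

    def write(self, n, value):
        self._bank(n)[n] = value
```

Switching only the lower half leaves R8 through R15 visible to both contexts, which is how data is shared between them.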



Program Sequencing 


3.1 OVERVIEW 

Program flow in the ADSP-21020/21010 is most often linear; the processor 
executes program instructions sequentially. Variations in this linear flow 
are provided by the following program structures, illustrated in 
Figure 3.1 on the following page: 

• Loops. One sequence of instructions is executed several times with zero 
overhead. 

• Subroutines. The processor temporarily interrupts sequential flow to 
execute instructions from another part of program memory. 

• Jumps. Program flow is permanently transferred to another part of 
program memory. 

• Interrupts. A special case of subroutines in which the execution of the 
routine is triggered by an event that happens at run time, not by a 
program instruction. 

• Idle. A special instruction that causes the processor to cease operations, 
holding its current state. When an interrupt occurs, the processor 
services the interrupt and continues normal execution. 

Managing these program structures is the job of the ADSP-21020/21010's 
program sequencer. The program sequencer selects the address of the next 
instruction, generating most of those addresses itself. It also performs a 
wide range of related functions, such as 

• incrementing the fetch address, 

• maintaining stacks, 

• evaluating conditions, 

• decrementing the loop counter, 

• calculating new addresses, 

• maintaining an instruction cache, and 

• handling interrupts. 



Figure 3.1 Program Flow Variations (panels: Linear Flow, Loop, Jump, Subroutine, Interrupt, Idle) 


3.1.1 Instruction Cycle 

The ADSP-21020 processes instructions in three clock cycles: 

• In the fetch cycle, the ADSP-21020 reads the instruction from either the 
internal instruction cache or program memory. 

• During the decode cycle, the instruction is decoded, generating 
conditions that control instruction execution. 

• In the execute cycle, the ADSP-21020 executes the instruction; the 
operations specified by the instruction are completed. 



These cycles are overlapping, or pipelined, as shown in Figure 3.2. In 
sequential program flow, when one instruction is being fetched, the 
instruction fetched in the previous cycle is being decoded, and the 
instruction fetched two cycles before is being executed. Thus, the 
throughput is one instruction per cycle. 
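
The overlap of the three stages in sequential flow can be sketched with a short Python model. The function name is hypothetical, and the model covers only linear flow with no stalls or branches.

```python
def pipeline_schedule(first_address, cycles):
    """List (cycle, fetch, decode, execute) addresses for sequential flow.

    A stage that has no instruction yet is reported as None.
    """
    rows = []
    for cycle in range(cycles):
        fetch = first_address + cycle
        decode = fetch - 1 if cycle >= 1 else None   # fetched one cycle earlier
        execute = fetch - 2 if cycle >= 2 else None  # fetched two cycles earlier
        rows.append((cycle + 1, fetch, decode, execute))
    return rows
```

Starting at address 0x08, the third cycle fetches 0x0A, decodes 0x09 and executes 0x08; from then on one instruction completes per cycle.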


time (cycles)    1       2       3       4       5 

Fetch            0x08    0x09    0x0A    0x0B    0x0C 
Decode                   0x08    0x09    0x0A    0x0B 
Execute                          0x08    0x09    0x0A 

Figure 3.2 Pipelined Execution Cycles 

Any non-sequential program flow can potentially decrease the 
ADSP-21020's instruction throughput. Non-sequential program 
operations include: 

• Program memory data accesses that conflict with instruction fetches 

• Jumps 

• Subroutine Calls and Returns 

• Interrupts and Returns 

• Loops 

3.1.2 Program Sequencer Architecture 

Figure 3.3, on the next page, shows a block diagram of the program 
sequencer. The sequencer selects the value of the next fetch address from 
several possible sources. 

The fetch address register, decode address register and program counter 
(PC) contain, respectively, the addresses of the instructions currently 
being fetched, decoded and executed. The PC is coupled with the PC 
stack, which is used to store return addresses and top-of-loop addresses. 

Figure 3.3 Program Sequencer Block Diagram (labeled blocks include LOOP LOGIC and PMA BUS) 


The interrupt controller performs all functions related to interrupt 
processing, such as determining whether an interrupt is masked and 
outputting the appropriate interrupt vector. 

The instruction cache provides a means by which the ADSP-21020 can 
access data in program memory and fetch an instruction in the same cycle. 
The DAG2 data address generator (described in Chapter 4) outputs 
program memory data addresses. 

The sequencer evaluates conditional instructions and loop termination 
conditions using information from the status registers. The loop address 
stack and loop counter stack support nested loops. The status stack stores 
status registers for implementing nested external interrupt routines. 
3.1.2.1 Program Sequencer Registers & System Registers 

Table 3.1 lists the registers located in the program sequencer. The 
functions of these registers are described in subsequent sections of this 
chapter. All registers in the program sequencer are universal registers and 
are thus accessible to other universal registers as well as external data 
memory. All registers and the tops of stacks are readable; all registers 
except the fetch address, decode address and PC are writeable. The PC 
stack can be pushed and popped by writing the PC stack pointer, which is 
readable and writeable. The loop address stack and status stack are 
pushed and popped by explicit instructions. 

The system register bit manipulation instruction can be used to set, clear, 
toggle or test specific bits in the system registers. This instruction is 
described in Appendix A, Group IV-Miscellaneous instructions. 

Due to pipelining, writes to some of these registers do not take effect on 
the next cycle; for example, if you write the MODE1 register to enable 
ALU saturation mode, the change will not occur until two cycles after the 
write. Also, some registers are not updated on the cycle immediately 
following a write; it takes an extra cycle before a read of the register yields 
the new value. Table 3.1 summarizes the number of extra cycles for a write 
to take effect (effect latency) and for a new value to appear in the register 
(read latency). A "0" indicates that the write takes effect or appears in the 
register on the next cycle after the write instruction is executed. A "1" 
indicates one extra cycle. 
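
The latency convention can be written as simple arithmetic. This Python sketch uses hypothetical names; the MODE1 effect latency of one extra cycle is taken from the example in the text, and the read latency of zero is the value listed for MODE1 in Table 3.1.

```python
MODE1_EFFECT_LATENCY = 1  # extra cycles before a MODE1 write takes effect
MODE1_READ_LATENCY = 0    # extra cycles before a read returns the new value

def first_cycle_with_new_value(write_cycle, extra_latency):
    """Cycle number at which a write executed in `write_cycle` is seen.

    A latency of 0 means the cycle right after the write; each extra
    cycle of latency pushes that point one cycle later.
    """
    return write_cycle + 1 + extra_latency
```

A MODE1 write executed in cycle 10 would therefore be readable in cycle 11 but would change ALU behavior only from cycle 12, two cycles after the write.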

Program Sequencer                                                        Read      Effect 
Registers     Contents                                        Bits      latency   latency 

FADDR*        fetch address                                   24        -         - 
DADDR*        decode address                                  24        -         - 
PC*           execute address                                 24        -         - 
PCSTK         top of PC stack                                 24        0         0 
PCSTKP        PC stack pointer                                5         1         1 
LADDR         top of loop address stack                       32        0         0 
CURLCNTR      top of loop count stack (current loop count)    32        0         0 
LCNTR         loop count for next DO UNTIL loop               32        0         0 

System Registers 

MODE1         mode control bits                               32        0         1 
MODE2         mode control bits                               32        0         1 
IRPTL         interrupt latch                                 32        0         0 
IMASK         interrupt mask                                  32        0         1 
IMASKP        interrupt mask pointer (for nesting)            32        1         1 
ASTAT         arithmetic status flags                         32        0         1 
STKY          sticky status flags                             32        0         1 
USTAT1        user-defined status flags                       32        0         0 
USTAT2        user-defined status flags                       32        0         0 

* read-only 

Table 3.1 Program Sequencer Registers & System Registers 
3.2 PROGRAM SEQUENCER OPERATIONS 

This section is an overview of the operation of the program sequencer. The 
various kinds of program flow are defined here and described in detail in 
subsequent sections. 

3.2.1 Sequential Instruction Flow 

The program sequencer determines the next instruction address by 
examining both the current instruction being executed and the current 
state of the processor. If no conditions require otherwise, the ADSP-21020 
executes instructions from program memory in sequential order by simply 
incrementing the fetch address. 

3.2.2 Program Memory Data Access 

Usually, the ADSP-21020 fetches an instruction from program memory on 
each cycle. When the ADSP-21020 executes an instruction which requires 
data to be read from or written to program memory, there is a conflict for 
that memory space. The ADSP-21020 has an instruction cache to reduce 
delays caused by this type of conflict. 

The first time that the ADSP-21020 encounters an instruction fetch that 
conflicts with a program memory data access, it must fetch the instruction 
on the following cycle, causing a delay. The ADSP-21020 automatically 
writes the fetched instruction to the cache to avoid the overhead should 
the same instruction fetch occur again. The ADSP-21020 checks the 
instruction cache on every program memory data access. If the instruction 
needed is in the cache, the instruction fetch from the cache happens in 
parallel with the program memory data access, without incurring a delay. 
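
The caching behavior described above can be sketched with a small Python model. The class and method names are hypothetical, and the model assumes an unbounded cache, since capacity and replacement are not covered in this section.

```python
class InstructionCacheModel:
    """Tracks which conflicting instruction fetches have been cached."""

    def __init__(self):
        self.cached = set()

    def fetch_during_pm_data_access(self, address):
        """Return the delay, in cycles, for a fetch that conflicts with a
        program memory data access."""
        if address in self.cached:
            return 0          # served from the cache, in parallel with the data access
        self.cached.add(address)  # first encounter: fetched a cycle late, then cached
        return 1
```

In a loop that performs a program memory data access, only the first pass pays the one-cycle delay in this model; later passes find the instruction in the cache.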

3.2.3 Branches 

A branch occurs when the fetch address is not the next sequential address 
following the previous fetch address. Jumps, calls and returns are the 
types of branches which the ADSP-21020 supports. In the program 
sequencer, the only difference between a jump and a call is that upon 
execution of a call, a return address is pushed onto the PC stack so that it 
is available when a return instruction is later executed. Jumps branch to a 
new location without allowing return. 

3.2.4 Loops 

The ADSP-21020 supports loop instructions through the DO UNTIL 
instruction. The DO UNTIL instruction causes the ADSP-21020 to repeat a 
sequence of instructions until a specified condition tests true. 



3.3 CONDITIONAL INSTRUCTION EXECUTION 

The program sequencer evaluates conditions to determine whether to 
execute a conditional instruction and when to terminate a loop. The 
conditions are based on information from the arithmetic status (ASTAT) 
register, mode control 1 (MODE1) register, flag inputs and loop counter. 
The arithmetic ASTAT bits are described in Chapter 2 in the description of 
each computation unit. 

Each condition that the ADSP-21020 evaluates has an assembler 
mnemonic and a unique code (number) that is used in a conditional 
instruction's opcode. For most conditions, the program sequencer can test 
both true and false states, e.g., equal to zero and not equal to zero. 

Table 3.2, on the following page, defines the 32 status conditions. 

The bit test flag (BTF) is bit 18 of the ASTAT register. This flag is set (or 
cleared) by the results of the BIT TST and BIT XOR forms of the 
System Register Bit Manipulation instruction, which can be used to test the 
contents of the ADSP-21020's system registers. This instruction is 
described in Appendix A, Group IV-Miscellaneous instructions. After BTF 
is set by this instruction, it can be used as the condition in a conditional 
instruction (with the mnemonic TF; see Table 3.2). 

The two conditions that do not have complements are LCE/NOT LCE 
(loop counter expired/not expired) and TRUE/FOREVER. The 
interpretation of these condition codes is determined by context; TRUE 
and NOT LCE are used in conditional instructions, FOREVER and LCE in 
loop termination. The IF TRUE construct creates an unconditional 
instruction (the same effect as leaving out the condition entirely). A DO 
FOREVER instruction executes a loop indefinitely, until an interrupt or 
reset intervenes. 

Because the LCE condition checks the value of the loop counter 
(CURLCNTR), an IF NOT LCE conditional instruction should not follow a 
write to CURLCNTR from memory. Otherwise, because the write occurs 
after the NOT LCE test, the condition is based on the old CURLCNTR 
value. 
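
How the sequencer interprets a condition code can be sketched in Python. This model is illustrative only: the function name is hypothetical, it covers just the single-flag conditions of Table 3.2 plus codes 15 and 31, and it leaves out LT, LE and the flag inputs.

```python
# Flag tested by each simple condition code (a subset of Table 3.2).
FLAG_FOR_CODE = {0: "AZ", 3: "AC", 4: "AV", 5: "MV", 6: "MN",
                 7: "SV", 8: "SZ", 13: "BTF"}

def condition_true(code, flags, curlcntr=None, loop_termination=False):
    """Evaluate a condition code for an IF (default) or a DO UNTIL."""
    if code == 15:
        expired = curlcntr == 1
        # LCE in loop termination, NOT LCE in a conditional instruction.
        return expired if loop_termination else not expired
    if code == 31:
        # FOREVER never terminates a loop; TRUE always executes.
        return not loop_termination
    base = code - 16 if code >= 16 else code
    if base not in FLAG_FOR_CODE:
        raise ValueError("condition code not covered by this sketch")
    wanted = 0 if code >= 16 else 1   # codes 16-30 test the complement
    return flags.get(FLAG_FOR_CODE[base], 0) == wanted
```

Codes 15 and 31 are the only ones whose meaning depends on context, which is why the sketch needs the `loop_termination` argument.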



No.  Mnemonic        Description                       True If 

0    EQ              ALU equal zero                    AZ = 1 
1    LT              ALU less than zero                [(not AF) and (AN xor (AV and (not ALUSAT))) 
                                                       or (AF and AN and (not AZ))] = 1 
2    LE              ALU less than or equal zero       [(not AF) and (AN xor (AV and (not ALUSAT))) 
                                                       or (AF and AN)] or AZ = 1 
3    AC              ALU carry                         AC = 1 
4    AV              ALU overflow                      AV = 1 
5    MV              Multiplier overflow               MV = 1 
6    MS              Multiplier sign                   MN = 1 
7    SV              Shifter overflow                  SV = 1 
8    SZ              Shifter zero                      SZ = 1 
9    FLAG0_IN        Flag 0 input                      FI0 = 1 
10   FLAG1_IN        Flag 1 input                      FI1 = 1 
11   FLAG2_IN        Flag 2 input                      FI2 = 1 
12   FLAG3_IN        Flag 3 input                      FI3 = 1 
13   TF              Bit test flag                     BTF = 1 
14                   Reserved 
15   LCE             Loop counter expired              CURLCNTR = 1 
                     (DO UNTIL term) 
15   NOT LCE         Loop counter not expired          CURLCNTR ≠ 1 
                     (IF cond) 

Bits 16-30 are the complements of bits 0-14 

16   NE              ALU not equal to zero             AZ = 0 
17   GE              ALU greater than or equal zero    [(not AF) and (AN xor (AV and (not ALUSAT))) 
                                                       or (AF and AN and (not AZ))] = 0 
18   GT              ALU greater than zero             [(not AF) and (AN xor (AV and (not ALUSAT))) 
                                                       or (AF and AN)] or AZ = 0 
19   NOT AC          Not ALU carry                     AC = 0 
20   NOT AV          Not ALU overflow                  AV = 0 
21   NOT MV          Not multiplier overflow           MV = 0 
22   NOT MS          Not multiplier sign               MN = 0 
23   NOT SV          Not shifter overflow              SV = 0 
24   NOT SZ          Not shifter zero                  SZ = 0 
25   NOT FLAG0_IN    Not Flag 0 input                  FI0 = 0 
26   NOT FLAG1_IN    Not Flag 1 input                  FI1 = 0 
27   NOT FLAG2_IN    Not Flag 2 input                  FI2 = 0 
28   NOT FLAG3_IN    Not Flag 3 input                  FI3 = 0 
29   NOT TF          Not bit test flag                 BTF = 0 
30                   Reserved 
31   FOREVER         Always False (DO UNTIL)           always 
31   TRUE            Always True (IF)                  always 

Table 3.2 Condition Codes 
3.4 BRANCHES (CALL, JUMP, RTS, RTI) 

The CALL instruction initiates a subroutine. Both jumps and calls transfer 
program flow to another part of program memory, but a call also pushes a 
return address onto the PC stack so that it is available when a return from 
subroutine instruction is later executed. Jumps branch to a new location 
without allowing return. 

A return causes the processor to branch to the address stored at the top of 
the PC stack. There are two types of returns: return from subroutine (RTS) 
and return from interrupt (RTI). The difference between the two is that the 
RTI instruction not only pops the return address off the PC stack but also 
pops the status stack if status registers (ASTAT and MODE1) have been 
pushed as a result of an external interrupt. 

There are a number of parameters you can specify for branches: 

• Jumps, calls and returns can be conditional. The program sequencer 
can evaluate any one of several status conditions to decide whether the 
branch should be taken. If no condition is specified, the branch is 
always taken. 

• Jumps and calls can be indirect, direct, or PC-relative. An indirect 
branch goes to an address supplied by one of the data address 
generators, DAG2. Direct branches go to the 24-bit address specified in 
an immediate field in the branch instruction. PC-relative branches also 
use a value specified in the instruction, but the sequencer adds this 
value to the current PC value to compute the address. 

• Jumps, calls and returns can be delayed or nondelayed. In a delayed 
branch, the two instructions immediately after the branch instruction 
are executed; in a nondelayed branch, the program sequencer 
suppresses the execution of those two instructions (no-operations are 
performed instead). 

• The JUMP (LA) instruction causes an automatic loop abort if it occurs 
inside a loop. When the loop is aborted, the PC and loop address 
stacks are popped once, so that if the loop was nested, the stacks still 
contain the correct values for the outer loop. (This is similar to the break 
instruction of the C programming language used to prematurely 
terminate execution of a loop.) 
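
The three addressing options for jumps and calls can be sketched as follows. This is a Python model with hypothetical names; wrapping the result to 24 bits is an assumption based on the 24-bit address width, not a statement about how the hardware treats out-of-range values.

```python
ADDRESS_MASK = 0xFFFFFF  # 24-bit program memory address

def branch_target(kind, pc=0, operand=0, dag2_address=0):
    """Compute the address a jump or call goes to.

    kind: "direct" uses the immediate field, "pc_relative" adds the
    immediate field to the current PC, "indirect" uses the address
    supplied by DAG2.
    """
    if kind == "direct":
        return operand & ADDRESS_MASK
    if kind == "pc_relative":
        return (pc + operand) & ADDRESS_MASK
    if kind == "indirect":
        return dag2_address & ADDRESS_MASK
    raise ValueError("unknown branch kind")
```

A PC-relative branch with a negative offset lands before the current instruction, which is the usual way to express a backward branch in position-independent code.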


3.4.1 Delayed And Nondelayed Branches 

An instruction modifier (DB) indicates that a branch is delayed; otherwise, 
it is nondelayed. If the branch is nondelayed, the two instructions after the 
branch, which are in the fetch and decode stages, are not executed (see 
Figure 3.4); for a call, the decode address (the address of the instruction 
after the call) is the return address. During the two no-operation cycles, 
the first instruction at the branch address is fetched and decoded. 
Figure 3.4 Nondelayed Branches 






In a delayed branch, the processor continues to execute two more 
instructions while the instruction at the branch address is fetched and 
decoded (see Figure 3.5); in the case of a call, the return address is the 
third address after the branch instruction. A delayed branch is more 
efficient, but it makes the code harder to understand because of the 
instructions between the branch instruction and the actual branch. 
Figure 3.5 Delayed Branches 
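
The difference between the two branch types can be sketched in Python. The function names are hypothetical; the model lists what occupies the execute stage starting with the branch instruction and gives the return address pushed by a call.

```python
def executed_around_branch(branch_address, target, delayed):
    """Execute-stage contents for the branch and the three cycles after it."""
    b = branch_address
    if delayed:
        # The two instructions after the branch still execute.
        return [b, b + 1, b + 2, target]
    # Nondelayed: the two following instructions are suppressed.
    return [b, "nop", "nop", target]

def call_return_address(call_address, delayed):
    """Return address pushed on the PC stack by a call."""
    return call_address + 3 if delayed else call_address + 1
```

In both cases the target executes three cycles after the branch; the delayed form simply does useful work in the two cycles in between.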















Because of the instruction pipeline, a delayed branch instruction and the 
two instructions that follow it in program memory must be executed 
sequentially. Instructions in the two program memory locations 
immediately following a delayed branch instruction cannot be any of the 
following: 

• Other Jumps, Calls or Returns 

• Pushes or Pops of the PC stack 

• Writes to the PC stack or PC stack pointer 

• DO UNTIL instruction 

• IDLE instruction 

These exceptions are checked by the ADSP-21020 assembler. 

The ADSP-21020 does not process an interrupt in between a delayed 
branch instruction and either of the two instructions that follow, since 
these three instructions must be executed sequentially. Any interrupt that 
occurs during these instructions is latched but not processed until the 
branch is complete. 

A read of the PC stack or PC stack pointer immediately after a delayed call 
or return is permitted, but it will show that the return address on the PC 
stack has already been pushed or popped, even though the branch has not 
occurred yet. 

3.4.2 PC Stack 

The PC stack holds return addresses for subroutines and interrupt service 
routines and top-of-loop addresses for loops. The PC stack is 20 deep by 
24 bits wide. 

The PC stack is popped during returns from interrupts (RTI), returns from 
subroutines (RTS) and terminations of loops. The stack is full when all 
entries are occupied, empty when no entries are occupied, and overflowed 
if a call occurs when the stack is already full. The full and empty flags are 
stored in the sticky status register (STKY). The full flag causes a maskable 
interrupt. 

A PC stack interrupt occurs when 19 locations of the PC stack are filled 
(the almost full state). Entering the interrupt service routine then 
immediately causes a push on the PC stack, making it full. Thus the 
interrupt is a full interrupt, even though the condition which triggers it is 
the almost full condition. The other stacks in the sequencer, the loop 
address stack, loop counter stack and status stack, are provided with 
overflow interrupts that are activated when a push occurs while the stack 
is in a full state. 





The program counter stack pointer (PCSTKP) is a readable and writeable 
register that contains the address of the top of the PC stack. The value of 
PCSTKP is zero when the PC stack is empty, 1, 2, ..., 20 when the stack 
contains data, and 31 when the stack is overflowed. A write to PCSTKP 
takes effect after a one-cycle delay. If the PC stack is overflowed, a write to 
PCSTKP has no effect. 
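
The PC stack states and the PCSTKP values can be sketched with a small Python model. The class name and attributes are hypothetical, and dropping the pushed address on overflow is an assumption; the text only says that the stack becomes overflowed.

```python
class PCStackModel:
    DEPTH = 20
    OVERFLOWED = 31  # PCSTKP value reported after an overflow

    def __init__(self):
        self.entries = []
        self.overflowed = False
        self.interrupt_requested = False

    @property
    def pcstkp(self):
        return self.OVERFLOWED if self.overflowed else len(self.entries)

    def push(self, address):
        if self.overflowed or len(self.entries) == self.DEPTH:
            self.overflowed = True   # push while full: stack is overflowed
            return
        self.entries.append(address & 0xFFFFFF)
        if len(self.entries) == self.DEPTH - 1:
            # Almost full (19 entries): the PC stack interrupt is raised here.
            self.interrupt_requested = True

    def pop(self):
        return self.entries.pop()
```

The interrupt is requested at 19 entries so that the push caused by entering the service routine fills the last location instead of overflowing the stack.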


3.5 LOOPS 

The DO UNTIL instruction provides for efficient software loops, without 
the overhead of additional instructions to branch, test a condition, or 
decrement a counter. Here is a simple example of an ADSP-21020 loop: 

            LCNTR=30, DO label UNTIL LCE; 
                R0=DM(I0,M0), F2=PM(I8,M8); 
                R1=R0-R15; 
    label:      F4=F2+F3; 

Chapter 8 contains more examples of DO UNTIL loops. 

When the ADSP-21020 executes a DO UNTIL instruction, the program 
sequencer pushes the address of the last loop instruction and the 
termination condition for exiting the loop (both specified in the 
instruction) onto the loop address stack. It also pushes the top-of-loop 
address, which is the address of the instruction following the DO UNTIL 
instruction, on the PC stack. 

Because of the instruction pipeline (fetch, decode and execute cycles), the 
processor tests the termination condition (and, if the loop is counter- 
based, decrements the counter) before the end of the loop so that the next 
fetch either exits the loop or returns to the top based on the test condition. 
Specifically, the condition is tested when the instruction two locations 
before the last instruction in the loop (at location e-2, where e is the end- 
of-loop address) is executed. If the termination condition is not satisfied, 
the processor fetches the instruction from the top-of-loop address stored 
on the top of the PC stack. If the termination condition is true, the 
sequencer fetches the next instruction after the end of the loop and pops 
the loop stack and PC stack. Loop operation is shown in Figure 3.6, on the 
next page. 
Figure 3.6 Loop Operation 
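
The counter handling for a loop of three or more instructions can be sketched in Python. This is a simplified model with a hypothetical function name: it records the CURLCNTR value seen at the test point (location e-2) on each pass and treats LCE as CURLCNTR = 1.

```python
def run_counter_loop(lcntr, length):
    """Return the CURLCNTR values seen at the e-2 test point, one per pass."""
    if length < 3:
        raise ValueError("short loops are handled differently")
    if lcntr < 1:
        raise ValueError("this sketch expects a count of at least 1")
    curlcntr = lcntr
    seen = []
    while True:
        seen.append(curlcntr)
        if curlcntr == 1:   # LCE tests true: this pass is the last one
            break
        curlcntr -= 1       # not expired: decrement and fetch top-of-loop
    return seen
```

With LCNTR=30, as in the example loop, the model makes 30 passes, and the test that ends the loop happens two instructions before the final instruction of the last pass.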

3.5.1 Restrictions And Short Loops 

This section describes several programming restrictions for loops. It also 
explains restrictions applying to short (one- and two-instruction) loops, 
which require special consideration because of the three-instruction fetch- 
decode-execute pipeline. 

3.5.1.1 General Restrictions 

The last three instructions of a loop cannot be any branch except a jump 
with loop abort (LA); otherwise, the loop may not be executed correctly. 

Nested loops cannot terminate on the same instruction. 





3.5.1.2 Counter-Based Loops 

The third-to-last instruction of a counter-based loop (at e-2, where e is the 
end-of-loop address) cannot be a write to the counter from external 
memory. 

Short loops terminate in a special way because of the instruction (fetch- 
decode-execute) pipeline. Counter-based loops of one or two instructions 
are not long enough for the sequencer to check the termination condition 
two instructions from the end of the loop. In these short loops, the 
sequencer has already looped back when the termination condition is 
tested. The sequencer provides special handling to avoid overhead (no- 
operation) cycles if the loop is iterated a minimum number of times. The 
detailed operation is shown in Figures 3.7 and 3.8 (on the following page). 
For no overhead, a loop of length one must be executed at least three 
times and a loop of length two must be executed at least twice. 


ONE-INSTRUCTION LOOP, THREE ITERATIONS 

CLOCK CYCLES ► 

Execute Instruction    n      n+1          n+1          n+1          n+2 
                              (first       (second      (third 
                              iteration)   iteration)   iteration) 
Decode Instruction     n+1    n+1          n+1          n+2          n+3 
Fetch Instruction      n+2    n+1          n+2          n+3          n+4 

Notes, left to right: LCNTR <- 3; opcode latch not updated, fetch address 
not updated, count expired tests true; loop-back aborts, PC & loop stacks 
popped. 

ONE-INSTRUCTION LOOP, TWO ITERATIONS (Two Cycles of Overhead) 
Loops of length one that iterate only once or twice and loops of length two 
that iterate only once incur two cycles of overhead because there are two 
aborted instructions after the last iteration to clear the instruction pipeline. 
Figure 3.8 Two-Instruction Loops 
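
The overhead rule for counter-based loops can be summarized in a short Python sketch. The function name is hypothetical, and the model returns only the number of no-operation cycles stated in the text.

```python
def short_loop_overhead(length, iterations):
    """Overhead cycles after the last iteration of a counter-based loop."""
    if length == 1 and iterations < 3:
        return 2   # one-instruction loop run only once or twice
    if length == 2 and iterations < 2:
        return 2   # two-instruction loop run only once
    return 0       # enough iterations, or a loop of three or more instructions
```

The two overhead cycles are the aborted instructions that clear the pipeline, so very short loops with very small counts may be cheaper to unroll.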

Processing of an interrupt that occurs during the last iteration of a one- 
instruction loop that executes once or twice, a two-instruction loop that 
executes once, or the cycle following one of these loops (which is a no- 
operation) is delayed by one cycle. Similarly, in a one-instruction loop that 
iterates at least three times, processing is delayed by one cycle if the 
interrupt occurs during the third-to-last iteration. 




3.5.1.3 Non-Counter-Based Loops 

A non-counter-based loop is one in which the loop termination condition 
is something other than LCE. When a non-counter-based loop is the outer 
loop of a series of nested loops, the end address of the outer loop must be 
located at least two addresses after the end address of the inner loop. 

The JUMP (LA) instruction is used to prematurely abort execution of a 
loop. When this instruction is located in the inner loop of a series of nested 
loops and the outer loop is non-counter-based, the address jumped to 
cannot be the last instruction of the outer loop. The address jumped to 
may, however, be the next-to-last instruction (or any earlier). 

Non-counter-based short loops terminate in a special way because of the 
fetch-decode-execute instruction pipeline: 

• In a three-instruction loop, the termination condition is tested when 
the top of loop instruction is executed. When the condition becomes 
true, the sequencer completes one full pass of the loop before exiting. 

• In a two-instruction loop, the termination condition is checked during 
the last (second) instruction. If the condition becomes true when the 
first instruction is executed, it tests true during the second and one 
more full pass is completed before exiting. If the condition becomes 
true during the second instruction, however, two more full passes 
occur before the loop exit. 

• In a one-instruction loop, the termination condition is checked every 
cycle. When the condition becomes true, the loop executes three more 
times before exiting. 

3.5.2 Loop Address Stack 

The loop address stack is six levels deep by 32 bits wide. The 32-bit word 
of each level consists of a 24-bit loop termination address, a 5-bit 
termination code, and a 2-bit loop type code: 

Bits     Value 

0-23     Loop termination address 
24-28    Termination code 
29       reserved (always reads 0) 
30-31    Loop type code: 
           00   arithmetic condition-based (not LCE) 
           01   counter-based, length 1 
           10   counter-based, length 2 
           11   counter-based, length > 2 
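
The field layout of a loop address stack word can be sketched with two Python helpers. The function names are hypothetical; the bit positions are those listed above.

```python
def pack_laddr(termination_address, termination_code, loop_type):
    """Build the 32-bit loop address stack word from its three fields."""
    return (((loop_type & 0x3) << 30)             # bits 30-31: loop type code
            | ((termination_code & 0x1F) << 24)   # bits 24-28: termination code
            | (termination_address & 0xFFFFFF))   # bits 0-23: termination address

def unpack_laddr(word):
    """Split a 32-bit loop address stack word into its fields."""
    return {
        "termination_address": word & 0xFFFFFF,
        "termination_code": (word >> 24) & 0x1F,
        "loop_type": (word >> 30) & 0x3,
    }
```

For a counter-based loop longer than two instructions that ends at address 0x000123, the termination code is 15 (LCE) and the loop type is 11, giving the word 0xCF000123.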







The loop termination address, termination code and loop type code are 
stacked when a DO UNTIL or PUSH LOOP instruction is executed. The 
stack is popped two instructions before the end of the last loop iteration or 
when a POP LOOP instruction is issued. A stack overflows if a push 
occurs when all entries in the loop stack are occupied. The stack is empty 
when no entries are occupied. The overflow and empty flags are in the 
sticky status register (STKY). Overflow causes a maskable interrupt. 

The LADDR register contains the top of the loop address stack. It is 
readable and writeable over the DMD bus. Reading and writing LADDR 
does not move the loop address stack pointer; a stack push or pop, 
performed with explicit instructions, moves the stack pointer. LADDR 
contains the value 0xFFFF FFFF when the loop address stack is empty. 

Because the termination condition is checked two instructions before the 
end of the loop, the loop stack is popped before the end of the loop on the 
final iteration. If LADDR is read at either of these instructions, the value 
will no longer be the termination address for the loop. 

A jump out of a loop pops the loop address stack (and the loop count 
stack if the loop is counter-based) if the Loop Abort option is specified for 
the jump. This allows the loop mechanism to continue to function 
correctly. Only one pop is performed, however, so the Loop Abort cannot 
be used to jump more than one level of loop nesting. 

3.5.3 Loop Counters And Stack 

The loop counter stack is six levels deep by 32 bits wide. The loop counter 
stack works in synchronization with the loop address stack; both stacks 
always have the same number of locations occupied. Thus, the same 
empty and overflow status flags apply to both stacks. 

The ADSP-21020 program sequencer operates two separate loop counters: 
the current loop counter (CURLCNTR), which tracks iterations for a loop 
being executed, and the loop counter (LCNTR), which holds the count 
value before the loop is executed. Two counters are needed to maintain 
the count for an outer loop while setting up the count for an inner loop. 

3.5.3.1 CURLCNTR 

The top entry in the loop counter stack always contains the loop count 
currently in effect. This entry is the CURLCNTR register, which is 
readable and writeable over the DMD bus. A read of CURLCNTR when 
the loop counter stack is empty gives the value 0xFFFF FFFF. 

The program sequencer decrements the value of CURLCNTR for each 
loop iteration. Because the termination condition is checked two 
instruction cycles before the end of the loop, the loop counter is also 
decremented before the end of the loop. If CURLCNTR is read at either of 
the last two loop instructions, therefore, the value is already the count for 
the next iteration. 

The loop counter stack is popped two instructions before the end of the 
last loop iteration. When the loop counter stack is popped, the new top 
entry of the stack becomes the CURLCNTR value, the count in effect for 
the executing loop. If there is no executing loop, the value of CURLCNTR 
is 0xFFFF FFFF after the pop. 

Writing CURLCNTR does not cause a stack push. Thus, if you write a new 
value to CURLCNTR, you change the count value of the loop currently 
executing. A write to CURLCNTR when no DO UNTIL LCE loop is 
executing has no effect. 

Because the processor must use CURLCNTR to perform counter-based 
loops, there are some restrictions on when you can write CURLCNTR. As 
mentioned under "Loop Restrictions," the third-to-last instruction of a DO 
UNTIL LCE loop cannot be a write to CURLCNTR from external memory. 
The instruction that follows a write to CURLCNTR from memory cannot 
be an IF NOT LCE instruction. 

3.5.3.2 LCNTR 

LCNTR is the value of the top of the loop counter stack plus one, i.e., it is 
the location on the stack which will take effect on the next loop stack push. 
To set up a count value for a nested loop without affecting the count value 
of the loop currently executing, you write the count value to LCNTR. A 
value of zero in LCNTR causes a loop to execute 2^32 times. 

The DO UNTIL LCE instruction pushes the value of LCNTR on the loop 
count stack, so that it becomes the new CURLCNTR value. This process is 
illustrated in Figure 3.9, on the next page. The previous CURLCNTR value 
is preserved one location down in the stack. 
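The relationship between LCNTR, CURLCNTR and the loop counter stack can be modeled roughly as follows. This is a simplified sketch: the class and method names are hypothetical, the per-iteration decrement and the early pop timing are not modeled, and only the behavior stated in this section is represented.

```python
EMPTY = 0xFFFFFFFF   # value read from CURLCNTR when the stack is empty
DEPTH = 6            # the loop counter stack is six levels deep

class LoopCounterStack:
    """Simplified model of LCNTR, CURLCNTR and the loop counter stack."""

    def __init__(self):
        self.entries = []      # occupied entries; the last item is the top
        self.lcntr = 0         # count staged for the next push
        self.overflow = False

    @property
    def curlcntr(self):
        # CURLCNTR is the top entry of the stack.
        return self.entries[-1] if self.entries else EMPTY

    def write_lcntr(self, value):
        # Data written to LCNTR is discarded when the stack is full.
        if len(self.entries) < DEPTH:
            self.lcntr = value & 0xFFFFFFFF

    def do_until_lce(self):
        # DO UNTIL LCE pushes LCNTR so that it becomes the new CURLCNTR.
        if len(self.entries) == DEPTH:
            self.overflow = True
            return
        self.entries.append(self.lcntr)

    def pop(self):
        if self.entries:
            self.entries.pop()

def iterations(count):
    """A count of zero causes the loop to execute 2**32 times."""
    return count if count else 2 ** 32
```

In this model, writing LCNTR while an outer loop is running leaves CURLCNTR untouched until the inner DO UNTIL LCE pushes the new count.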

A read of LCNTR when the loop counter stack is full results in invalid 
data. When the loop counter stack is full, any data written to LCNTR is 
discarded. 

If you read LCNTR during the last two instructions of a terminating loop, 
its value is the last CURLCNTR value for the loop. 

Figure 3.9 Pushing the Loop Counter Stack for Nested Loops 


3.6 INTERRUPTS 

An interrupt is caused by an external device asserting one of the ADSP- 
21020's interrupt inputs, by an internal exception such as a stack overflow, 
or by a user-defined software interrupt. An interrupt forces a call to a 
predefined address, the interrupt vector. The ADSP-21020 assigns a 
unique vector to each type of interrupt it recognizes. 

Externally, the ADSP-21020 supports four prioritized, individually 
maskable interrupts IRQ3-0, each of which can be either level or edge- 
triggered. Among the internal interrupts are arithmetic, format, stack and 
timer interrupts, and reset. 

An interrupt request is deemed valid if it is not masked, if interrupts are 
globally enabled (bit 12 in MODE1 is set), and if a higher priority request 
is not pending. Valid requests invoke an interrupt service sequence that 
branches to the address reserved for that interrupt. Interrupt vectors are 
spaced at 8-instruction intervals; longer service routines can be 
accommodated by branching to another area of the memory space. 
Execution returns to normal sequencing when an RTI (return from 
interrupt) instruction is executed. 

To process an interrupt, the program sequencer performs the following 
actions: 

1. Outputs the appropriate interrupt vector on the program memory 
address. 

2. Pushes the current PC value (return address) on the PC stack. 

3. If the interrupt is either an external interrupt (IRQ3-0) or the internal 
timer interrupt, the program sequencer pushes the current ASTAT and 
MODE1 registers on the status stack. 

4. Alters the interrupt mask pointer (IMASKP) to reflect the current 
interrupt nesting state. The nesting mode (NESTM) bit in the MODE1 
register determines whether all interrupts or only lower priority 
interrupts are masked during the service routine. 

All interrupt service routines, except for reset, should end with a return- 
from-interrupt (RTI) instruction. After reset, the PC stack is empty, so 
there is no return address. The last instruction of a reset service routine 
should be a jump to the start of user code. 

3.6.1 Interrupt Latency 

The ADSP-21020 responds to interrupts in three stages: synchronization 
and latching (1 cycle), recognition (1 cycle), and branching to the interrupt 
vector (2 cycles). See Figure 3.10 on the next page. If an interrupt is forced 
in software by a write to a bit in IRPTL, it is recognized in the following 
cycle, and the two cycles of branching to the interrupt vector follow that. 
Chapter 9 contains a discussion of synchronization for external interrupts 
and other asynchronous signals. 

Certain ADSP-21020 operations that span more than one cycle hold off 
interrupts. If an interrupt occurs during one of these operations, it is 
synchronized and latched, but its processing is delayed. The operations 
that delay interrupt processing are: 

• a branch (call, jump or return) and the following cycle, whether it is an 
instruction (in a delayed branch) or no-operation (in a non-delayed 
branch) 

• the first of the two cycles needed to perform a program memory data 
access and an instruction fetch (when there is an instruction cache 
miss) 

• the third-to-last iteration of a one-instruction loop 

• the last iteration of a one-instruction loop executed once or twice or of 
a two-instruction loop executed once, and the following cycle (which is 
a no-operation) 

• the first of the two cycles needed to fetch and decode the first 
instruction of an interrupt routine 

• waitstates for memory accesses 

• bus grant 

Figure 3.10 Interrupt Handling 

The ADSP-21020 cannot service an interrupt unless it is executing 
instructions or in the IDLE state. Interrupts are sampled, but not serviced, 
during bus grant and while the processor is waiting for memory 
acknowledge. IDLE is a special instruction that halts the processor until an 
external interrupt or timer interrupt occurs. 

For most interrupts, internal and external, only one instruction is executed 
after the interrupt occurs. The next two instructions are aborted while 
the processor fetches and decodes the first service routine instruction. 
Because of the one-cycle delay between an arithmetic exception and the 
STKY register update, however, there are two cycles after an arithmetic 
exception occurs before interrupt processing starts. 

3.6.2 Interrupt Latch 

The interrupt latch (IRPTL) register is a 32-bit register that latches 
interrupts generated by an external event (one of IRQ3-0) or an internal 
processor event (e.g., multiplier exception). This register contains any 
current interrupt or any pending interrupts. Because this register is 
readable and writeable, any interrupt except for reset can be set or cleared 
in software. Do not write to the reset bit (bit 1) in IRPTL because this puts 
the processor in an illegal state. 

IRPTL is cleared by a processor reset. 

Table 3.3 shows the bits in IRPTL. The second column lists the address (in 
hexadecimal) of the interrupt vector. Each interrupt vector is separated by 
eight memory locations. The third column lists an interrupt mnemonic. 
This name is provided for convenience; it is not required by the assembler. 

Bit (IM)  Address     Name    Function

0         0x00                Reserved for emulation*
1         0x08        RSTI    Reset (read-only)*
2         0x10                Reserved
3         0x18        SOVFI   Status stack or loop stack overflow or PC stack full
4         0x20        TMZHI   Timer=0 (high priority option)
5         0x28        IRQ3I   IRQ3 asserted
6         0x30        IRQ2I   IRQ2 asserted
7         0x38        IRQ1I   IRQ1 asserted
8         0x40        IRQ0I   IRQ0 asserted
9         0x48                Reserved
10        0x50                Reserved
11        0x58        CB7I    Circular buffer 7 overflow interrupt
12        0x60        CB15I   Circular buffer 15 overflow interrupt
13        0x68                Reserved
14        0x70        TMZLI   Timer=0 (low priority option)
15        0x78        FIXI    Fixed-point overflow
16        0x80        FLTOI   Floating-point overflow exception
17        0x88        FLTUI   Floating-point underflow exception
18        0x90        FLTII   Floating-point invalid exception
19-23     0x98-0xB8           Reserved
24        0xC0        SFT0I   User software interrupt 0
25        0xC8        SFT1I   User software interrupt 1
26        0xD0        SFT2I   User software interrupt 2
27        0xD8        SFT3I   User software interrupt 3
28        0xE0        SFT4I   User software interrupt 4
29        0xE8        SFT5I   User software interrupt 5
30        0xF0        SFT6I   User software interrupt 6
31        0xF8        SFT7I   User software interrupt 7

* Nonmaskable 

Table 3.3 Interrupt Vectors and Priority 

3.6.2.1 Interrupt Priority 

The interrupt bits in IRPTL are ordered by priority. The interrupt priority 
is from 0 (highest) to 31 (lowest). Interrupt priority determines which 
interrupt is serviced first when two occur in the same cycle. It also 
determines which interrupts are nested when nesting is enabled (see 
"Interrupt Nesting and IMASKP," later in this chapter). 

The arithmetic interrupts (fixed-point overflow and floating-point 
overflow, underflow and invalid operation) are determined from flags in 
the sticky status register (STKY). By reading these flags, the service 
routine for one of these interrupts can determine which condition caused 
the interrupt. The routine also has to clear the STKY bit so that the 
interrupt is not still active after the service routine is done. 

The timer reaching zero causes both interrupt 4 and interrupt 14. This 
feature allows you to choose the priority of the timer interrupt. Unmask 
the timer interrupt that has the priority you want, and leave the other one 
masked. Unmasking both interrupts results in two interrupts when the 
timer reaches zero. The processor would service the higher priority 
interrupt first, then the lower priority interrupt. 
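The priority rule and the 8-location vector spacing can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function name is hypothetical, the nonmaskable bits 0 and 1 are not treated specially, and nesting (IMASKP) is ignored.

```python
def next_interrupt(irptl, imask, irpten=True):
    """Return (bit, vector) for the highest-priority serviceable request.

    irptl  -- latched requests, one bit per interrupt (bit 0 = highest priority)
    imask  -- mask register; a set bit means the interrupt is unmasked
    irpten -- global interrupt enable (bit 12 of MODE1)
    Returns None when nothing can be serviced.
    """
    if not irpten:
        return None
    pending = irptl & imask & 0xFFFFFFFF
    if pending == 0:
        return None
    # The lowest-numbered set bit has the highest priority.
    bit = (pending & -pending).bit_length() - 1
    # Vectors are spaced eight locations apart.
    return bit, bit * 8

TMZHI = 1 << 4    # timer, high priority option
TMZLI = 1 << 14   # timer, low priority option
timer_expired = TMZHI | TMZLI   # the timer latches both bits
```

With only TMZLI unmasked, `next_interrupt(timer_expired, TMZLI)` selects bit 14 and vector 0x70; with both unmasked, bit 4 at vector 0x20 is chosen first.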

3.6.2.2 Software Interrupts 

The ADSP-21020 provides software interrupts that emulate interrupt 
behavior but are activated through software instead of hardware. An 
instruction that sets one of bits 24-31 in IRPTL (either a BIT SET 
instruction or a write to IRPTL) activates a software interrupt. The 
ADSP-21020 branches to the corresponding interrupt routine if that 
interrupt is not masked and interrupts are enabled. 

3.6.3 Interrupt Masking And Control 

All interrupts except for reset can be enabled and disabled by the global 
interrupt enable bit, IRPTEN, bit 12 in the MODE1 register. This bit is 
cleared at reset. You must set this bit for interrupts to be enabled. 

3.6.3.1 Interrupt Mask 

All interrupts except for the reset interrupt can be masked. Masked means the 
interrupt is disabled. Interrupts that are masked are still latched, so that if 
the interrupt is later unmasked, it is processed. Upon chip reset, all 
interrupts except reset are masked. 

The IMASK register controls interrupt masking. The bits in the IMASK 
register correspond directly to the same bits in the IRPTL register; for 
example, bit 10 in the IMASK register masks or unmasks the same 
interrupt latched by bit 10 in the IRPTL register. If a bit is set, its interrupt 
is unmasked (enabled); if the bit is cleared, the interrupt is masked 
(disabled). The IMASK register prevents the interrupts from being 
serviced but not from being latched in IRPTL for future recognition. 

3.6.3.2 Interrupt Nesting & IMASKP 

The ADSP-21020 supports the nesting of one interrupt service routine 
inside another; that is, a service routine can be interrupted by a higher 
priority interrupt. This feature is controlled by the nesting mode bit 
(NESTM) in the MODE1 register. When the NESTM bit is a 0, an interrupt 
service routine cannot be interrupted; any interrupt that occurs will be 
processed only after the routine finishes. When NESTM is a 1, higher 
priority interrupts can interrupt if they are not masked; lower or equal 
priority interrupts cannot. The NESTM bit should only be changed 
outside of an interrupt service routine or during the reset service routine; 
otherwise, interrupt nesting may not work correctly. 

In nesting mode, the ADSP-21020 uses the interrupt mask pointer 
(IMASKP) to create a temporary interrupt mask for each level of interrupt 
nesting; the IMASK value is not affected. The ADSP-21020 changes 
IMASKP each time a higher priority interrupt interrupts a lower priority 
service routine. 

The bits in IMASKP correspond to the interrupts in order of priority. 
When an interrupt occurs, its bit is set in IMASKP. If nesting is enabled, a 
new temporary interrupt mask is generated by masking all interrupts of 
equal or lower priority to the highest priority bit set in IMASKP (and 
keeping higher priority interrupts the same as in IMASK). When a return 
from an interrupt service routine is executed, the highest priority bit set in 
IMASKP is cleared, and again a new temporary interrupt mask is 
generated by masking all interrupts of equal or lower priority to the 
highest priority bit set in IMASKP. The bit set in IMASKP that has the 
highest priority always corresponds to the priority of the interrupt being 
serviced. 
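The way the temporary mask is derived from IMASK and IMASKP in nesting mode can be sketched as follows. The function names are hypothetical, and the sketch covers only the nesting-enabled case described above.

```python
def temporary_mask(imask, imaskp):
    """Temporary interrupt mask in nesting mode (set bit = unmasked).

    Interrupts of equal or lower priority than the highest-priority bit set
    in IMASKP are masked; higher-priority interrupts stay as in IMASK.
    """
    if imaskp == 0:
        return imask
    # Bit 0 is the highest priority, so the lowest set bit wins.
    top = (imaskp & -imaskp).bit_length() - 1
    higher_priority = (1 << top) - 1      # bits 0 .. top-1
    return imask & higher_priority

def enter_service(imaskp, bit):
    """An interrupt sets its own bit in IMASKP when it is serviced."""
    return imaskp | (1 << bit)

def return_from_interrupt(imaskp):
    """RTI clears the highest-priority (lowest-numbered) bit set in IMASKP."""
    return imaskp & (imaskp - 1)
```

For example, with IRQ3 (bit 5), IRQ0 (bit 8) and the low-priority timer (bit 14) unmasked in IMASK, servicing IRQ0 leaves only IRQ3 able to interrupt, and servicing IRQ3 on top of that leaves none of the three.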

If nesting is not enabled, the processor masks out all interrupts and 
IMASKP is not used, although IMASKP is still updated to create a 
temporary interrupt mask. 

An interrupt routine cannot be nested within itself. The ADSP-21020 
ignores and does not latch an interrupt that occurs while its service 
routine is already executing. 

3.6.4 Status Stack 

For low-overhead interrupt servicing, the ADSP-21020 automatically 
saves and restores the status and mode contexts of the interrupted 
program. The four external interrupts and the timer interrupt 
automatically push ASTAT and MODE1 onto the status stack, which is 
five levels deep. These registers are automatically popped from the status 
stack by the interrupt return (RTI) instruction. Other interrupts require 
explicit save and restore of the appropriate registers to data memory or 
program memory. 

Pushing ASTAT and MODE1 preserves the status and control bit settings 
so that if the service routine alters these bits, the original settings are 
automatically restored upon the return from interrupt. Note, however, 
that the Flag bits in ASTAT are not affected by status stack pushes and 
pops; the values of these bits carry over from the main program to the 
service routine and from the service routine to the main program. 

The top of the status stack contains the current values of ASTAT and 
MODE1. Reading and writing these registers does not move the stack 
pointer. The stack pointer is moved, however, by explicit PUSH and POP 
instructions. 

3.6.5 External Interrupt Timing & Sensitivity 

Each of the four ADSP-21020 external interrupts, IRQ3-0, can be either 
level- or edge-triggered. 

The ADSP-21020 samples interrupts once every CLKIN cycle. Level- 
sensitive interrupts are considered valid if sampled active (low). A level- 
sensitive interrupt must go inactive (high) before the processor returns 
from the interrupt service routine. If a level-sensitive interrupt is still 
active when the processor samples it, the processor treats it as a new 
request, repeating the same interrupt routine without returning to the 
main program (assuming no higher priority interrupts are active). 

Edge-triggered interrupt requests are considered valid if sampled high in 
one cycle and low in the next. The interrupt can stay active indefinitely. To 
request another interrupt, the signal must go high, then low again. 
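The difference between the two sensitivity modes can be seen by applying both rules to the same sequence of samples. The sketch below is illustrative only: the function name is hypothetical, and it assumes the line was high before the first sample.

```python
def valid_requests(samples, edge_sensitive):
    """Mark which CLKIN samples count as a valid interrupt request.

    samples -- IRQ pin level at each CLKIN cycle (1 = high, 0 = low/active)
    Level-sensitive: valid whenever the line is sampled low.
    Edge-triggered: valid only when sampled high in one cycle, low in the next.
    """
    results = []
    previous = 1   # assumption: the line was inactive before sampling began
    for level in samples:
        if edge_sensitive:
            results.append(previous == 1 and level == 0)
        else:
            results.append(level == 0)
        previous = level
    return results

line = [1, 0, 0, 0, 1, 0]
edge = valid_requests(line, edge_sensitive=True)
level = valid_requests(line, edge_sensitive=False)
```

Here the edge-triggered rule sees two requests (one per falling transition), while the level-sensitive rule sees a request in every cycle the line is held low.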

Edge-triggered interrupts require less external hardware compared to 
level-sensitive requests since there is never a need to negate the request. 
However, multiple interrupting devices may share a single level-sensitive 
request line on a wired-OR basis, which allows for easy system expansion. 

A bit for each interrupt in the MODE2 register indicates the sensitivity 
mode of each interrupt. 

MODE2 
Bit   Name    Definition 

0     IRQ0E   IRQ0  1=edge-sensitive; 0=level-sensitive 
1     IRQ1E   IRQ1  1=edge-sensitive; 0=level-sensitive 
2     IRQ2E   IRQ2  1=edge-sensitive; 0=level-sensitive 
3     IRQ3E   IRQ3  1=edge-sensitive; 0=level-sensitive 

Interrupts are sampled during a bus grant, but remain pending until the 
processor regains control of the bus and continues program execution. At 
that time pending interrupts are serviced in order of priority. 

3.6.5.1 Asynchronous External Interrupts 

The processor accepts interrupts that are asynchronous to the ADSP-21020 
clock; that is, an interrupt signal may change at any time. An 
asynchronous interrupt must be held low at least one CLKIN cycle to 
guarantee that it gets sampled. The delay associated with synchronizing 
asynchronous signals is discussed in Chapter 9. Synchronous interrupts 
need only meet the setup and hold time requirements relative to the rising 
edge of CLKIN. 


3.7 STACK FLAGS 

The STKY register maintains stack full and stack empty flags for the PC 
stack as well as overflow and empty flags for the status stack and loop 
stack. Unlike other STKY bits, several of these flag bits are not "sticky." 
They are set by the occurrence of the condition they indicate and are 
cleared when the condition is changed (by a push, pop or processor reset). 


Bit   Name   Definition              Sticky/Not Sticky   Cleared By

21    PCFL   PC stack full           Not sticky          Pop
22    PCEM   PC stack empty          Not sticky          Push
23    SSOV   Status stack overflow   Sticky              RESET
24    SSEM   Status stack empty      Not sticky          Push
25    LSOV   Loop stacks overflow*   Sticky              RESET
26    LSEM   Loop stacks empty*      Not sticky          Push

* Loop address stack and loop counter stack 




The status stack flags are read-only. Writes to the STKY register have no 
effect on these bits. 

The overflow and full flags are provided for diagnostic aid and are not 
intended to allow recovery from overflow. Status stack or loop stack 
overflow or PC stack full causes an interrupt. 

The empty flags facilitate off-chip stack saves. You monitor the empty flag 
when saving a stack to the external memory to know when all values have 
been transferred. The empty flags do not cause interrupts because an 
empty stack is an acceptable condition. 
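As a small illustration, the stack flag bits of STKY listed in this section can be picked out of a register value as shown below. The names `STACK_FLAGS` and `stack_flags` are hypothetical; only the bit positions come from the table above.

```python
# Bit positions of the stack flags in STKY.
STACK_FLAGS = {
    21: "PCFL",   # PC stack full
    22: "PCEM",   # PC stack empty
    23: "SSOV",   # status stack overflow
    24: "SSEM",   # status stack empty
    25: "LSOV",   # loop stacks overflow
    26: "LSEM",   # loop stacks empty
}

def stack_flags(stky):
    """Return the set of stack flag names that are set in a STKY value."""
    return {name for bit, name in STACK_FLAGS.items() if (stky >> bit) & 1}

# Example value with all three empty flags set and no overflow flags.
all_empty = (1 << 22) | (1 << 24) | (1 << 26)
```

A stack-save routine would test the relevant empty flag (for example LSEM) after each pop and stop transferring values once it is set.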


3.8 IDLE 

IDLE is a special instruction that halts the processor in a low-power state 
until an external interrupt or timer interrupt occurs. When the processor 
encounters an IDLE instruction, it fetches the instruction at the fetch 
address and then holds its outputs in the states shown in Table 3.4. 


Output     State 

PMA23-0    Next Fetch Address 
PMD47-0    High Impedance 
PMS1-0     Driven; value depends on address (one is high, the other low) 
PMRD       High 
PMWR       High 
PMPAGE     Driven; value depends on address 
DMA31-0    Driven; value undefined but stable 
DMD39-0    High Impedance 
DMS3-0     High 
DMRD       High 
DMWR       High 
DMPAGE     Low 
FLAG3-0    Depends on internal state 
BG         Depends on BR 
TIMEXP     Depends on internal state 
TDO        Depends on TRST, TCK and internal state 


Table 3.4 States of Outputs During IDLE 



The clock continues to run during IDLE, as well as the timer if enabled. 
When an interrupt occurs, either externally or from the timer, the 
processor responds as normal, outputting the interrupt vector. After two 
cycles needed to fetch and decode the first instruction of the interrupt 
routine, the processor continues executing instructions normally. On 
return from the interrupt, execution continues at the instruction after the 
IDLE instruction. 


3.9 INSTRUCTION CACHE 

The instruction cache is a 2-way, set-associative cache with entries for 32 
instructions. The operation of the cache is transparent to the programmer. 
The ADSP-21020 caches only instructions that conflict with program 
memory data accesses. This feature makes the cache considerably more 
efficient than a cache that loads every instruction, because typically only a 
few instructions access data from program memory. 

Because of the three-stage instruction pipeline, if the instruction at address 
n requires a program memory data access, there is a conflict with the 
instruction fetch at address n+2, assuming sequential execution. It is this 
fetched instruction (n+ 2) that is stored in the instruction cache, not the 
instruction requiring the program memory data access. 

If the instruction needed is in the cache, a "cache hit" occurs — the cache 
provides the instruction while the program memory data access is 
performed. If the instruction needed is not in the cache, a "cache miss" 
occurs, and the external instruction fetch takes place in the cycle following 
the program memory data access, incurring one cycle of overhead. This 
instruction is loaded into the cache, if the cache is enabled and not frozen, 
so that it is available the next time the same instruction requiring program 
memory data is executed. 

3.9.1 Cache Architecture 

Figure 3.11 is a block diagram of the instruction cache. The instruction 
cache contains 32 entries. An entry consists of a register pair containing an 
instruction and its address. Each entry has a Valid bit that is set if the 
entry contains a valid instruction. 

The entries are divided into 16 sets (set 15-set 0) of two entries each, entry 
0 and entry 1. Each set has an LRU (Least Recently Used) bit whose value 
indicates which of the two entries contains the least recently used 
instruction (1=entry 1, 0=entry 0). 

Figure 3.11 Instruction Cache Architecture 

Every possible instruction address is mapped to a set in the cache by its 4 
LSBs. When the processor needs to fetch an instruction from the cache, it 
uses the 4 address LSBs as an index to a particular set. Within that set, it 
checks the addresses of the two entries to see whether either contains the 
needed instruction. A cache hit occurs if the instruction is found, and the 
LRU bit is updated if necessary to indicate the entry that did not contain 
the needed instruction. 

A cache miss occurs if neither entry in the set contains the needed 
instruction. In this case, a new instruction and its address are loaded into 
the least recently used entry of the set that matches the 4 LSBs of the 
address. The LRU bit is toggled to indicate that the other entry in the set is 
now the least recently used. 
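The lookup and replacement behavior described above can be modeled in a few lines. This is a behavioral sketch only: the class name is hypothetical, the Valid bit is represented by an empty (None) entry, and cache freeze and disable are not modeled.

```python
class InstructionCache:
    """Behavioral model of a 2-way set-associative cache with 16 sets."""

    SETS = 16

    def __init__(self):
        # Two entries per set; None stands for an entry whose Valid bit is clear.
        self.tags = [[None, None] for _ in range(self.SETS)]
        # Per-set LRU bit: index of the least recently used entry.
        self.lru = [0] * self.SETS

    def access(self, address):
        """Return True on a cache hit; on a miss, load the LRU entry."""
        index = address & 0xF            # the 4 LSBs select the set
        tag = (address >> 4) & 0xFFFFF   # address bits 23-4
        entries = self.tags[index]
        if tag in entries:
            way = entries.index(tag)
            hit = True
        else:
            way = self.lru[index]
            entries[way] = tag
            hit = False
        # The entry not just used becomes the least recently used one.
        self.lru[index] = 1 - way
        return hit
```

Feeding the model three addresses that share the same 4 LSBs in rotation shows the effect of the LRU policy: two of them can coexist, but a third forces repeated misses.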

Because instructions are mapped to sets by their 4 address LSBs, there is 
no need to store these bits in the cache; the 4 LSBs are implied by the set in 
which the instruction has been stored. Only bits 23-4 are actually stored in 
a cache entry. 

3.9.2 Cache Efficiency 

Usually, cache operation and its efficiency are not a concern. However, 
there are some situations that can degrade cache efficiency and can be 
remedied easily in the program. 

When a cache miss occurs, the needed instruction is loaded into the cache 
so that if the same instruction is needed again, it will be there (a cache hit 
will occur). However, if another instruction whose address is mapped to 
the same set displaces this instruction, there will be a cache miss instead. 
The LRU bits help to reduce this possibility since at least two other 
instructions mapped to the same set must be needed before an instruction 
is displaced. If three instructions mapped to the same set are all needed 
repeatedly, cache efficiency (hit rate) can go to zero. The solution is to 
move one or more of the instructions to a new address, one that is 
mapped to a different set. 

An example of some code that is cache-inefficient is shown in Figure 3.12. 
The program memory data access at address 0x101 in the tight loop causes 
the instruction at 0x103 to be cached (in set 3). Each time the subroutine 
sub is called, the program memory data accesses at 0x201 and 0x211 
displace this instruction by loading the instructions at 0x203 and 0x213 
into set 3. If the subroutine is called only rarely during the loop execution, 
the impact will be minimal. If the subroutine is called frequently, the effect 
will be noticeable. If the execution of the loop is time-critical, it would be 
advisable to move the subroutine up one location (starting at 0x201), so 
that the two cached instructions end up in set 4 instead of 3. 
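The set numbers quoted in this example can be checked with a quick calculation. The helper names are hypothetical; the n+2 rule and the 4-LSB mapping are taken from the cache description in this chapter.

```python
def cached_fetch(pm_access_address):
    """Address of the fetch that conflicts with a PM data access at address n."""
    return pm_access_address + 2

def cache_set(address):
    """Set index selected by the 4 address LSBs."""
    return address & 0xF

# PM data accesses in the original layout: loop at 0x101, subroutine at 0x201, 0x211.
before = [cache_set(cached_fetch(a)) for a in (0x101, 0x201, 0x211)]

# After moving the subroutine up one location, its accesses sit at 0x202 and 0x212.
after = [cache_set(cached_fetch(a)) for a in (0x101, 0x202, 0x212)]
```

In the original layout all three cached instructions compete for set 3; after the move, the loop's instruction has set 3 to itself and the subroutine's two instructions share set 4.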

3.9.3 Cache Enable And Cache Freeze 

Freezing the cache prevents any changes to its contents; i.e., a cache miss 
will not result in a new instruction being stored in the cache. Disabling the 
cache stops its operation completely; all instruction fetches conflicting 
with program memory data accesses are delayed by the access. These 
functions are selected by the CADIS and CAFRZ (cache enable/disable 
and cache freeze) bits in the MODE2 register. The cache is cleared 
(contains no instructions), unfrozen and enabled after a reset. 


MODE2 
Bit   Name    Function 

4     CADIS   Cache disable 
19    CAFRZ   Cache freeze 


Address   Instruction 

0x100     LCNTR=1024, DO tight UNTIL LCE; 
0x101     R0=DM(I0,M0), PM(I8,M8)=F3; 
0x102     R1=R0-R15; 
0x103     IF EQ CALL (sub); 
0x104     F2=FLOAT R1; 
0x105     F3=F2*F2; 
0x106     tight: F3=F3+F4; 
0x107     PM(I8,M8)=F3; 
...
0x200     sub: R1=R13; 
0x201     R14=PM(I9,M9); 
...
0x211     PM(I9,M9)=R12; 
...
0x21F     RTS; 

Figure 3.12 Cache-Inefficient Code 



Data Addressing 


4.1 OVERVIEW 

The ADSP-21020/21010's two data address generators (DAGs) simplify 
the task of organizing data by maintaining pointers into memory. The 
DAGs allow the processor to address memory indirectly; that is, an 
instruction specifies a DAG register containing an address instead of the 
address value itself. 

Data address generator 1 (DAG1) produces 32-bit addresses for data 
memory. Data address generator 2 (DAG2) produces 24-bit addresses for 
program memory. The basic architecture for both DAGs is shown in 
Figure 4.1, which can be found on the following page. 

The DAGs also support in hardware some functions commonly used in 
digital signal processing algorithms. Both DAGs support circular buffers, 
which require advancing a pointer repetitively through a range of 
memory. DAG1 can also perform a bit-reverse operation, which places the 
bits of an address in reverse order to form a new address. 


4.2 DAG REGISTERS 

Each DAG contains four types of registers: Index (I), Modify (M), Base (B), 
and Length (L) registers. 

An I register acts as a pointer to memory, and an M register contains the 
increment value for advancing the pointer. By modifying an I register 
with different M values, you can vary the increment as needed. 

B registers and L registers are used only for circular data buffers. A 
B register holds the base (starting) address of a circular buffer. The same- 
numbered L register holds the number of locations in (i.e. the length of) 
the circular buffer. 



Each DAG contains eight of each type of register: 

DAG1 registers (32-bit)     DAG2 registers (24-bit) 
B0-B7                       B8-B15 
I0-I7                       I8-I15 
M0-M7                       M8-M15 
L0-L7                       L8-L15 

Figure 4.1 Data Address Generator Block Diagram (DAG1: N = 32; DAG2: N = 24) 

4.2.1 Alternate DAG Registers 

Each DAG register has an alternate register for context switching. For 
activating alternate registers, each DAG is organized into high and low 
halves, as shown in Figure 4.2. The high half of DAG1 contains the I, M, B 
and L registers numbered 4-7, and the low half, the registers numbered 0-3. 
Likewise, the high half of DAG2 consists of registers 12-15, and the low 
half consists of registers 8-11. 


Figure 4.2 (labels: DAG1 Registers (Data Memory), MODE1) 

Bits in the MODE1 register determine for each half whether primary or 
alternate registers are active (0=primary registers active, 
1=alternate registers active): 

MODE1 
Bit   Name    Definition 

3     SRD1H   DAG1 alternate register select (4-7) 
4     SRD1L   DAG1 alternate register select (0-3) 
5     SRD2H   DAG2 alternate register select (12-15) 
6     SRD2L   DAG2 alternate register select (8-11) 


This grouping of alternate registers lets you pass pointers between 
contexts in each DAG. 
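The mapping from a DAG register number to the MODE1 bit that selects its alternate set can be written out as a short sketch. The function names are hypothetical; the bit positions are those listed in the MODE1 table above.

```python
# MODE1 bit positions for the alternate register selects.
SELECT_BITS = {"SRD1H": 3, "SRD1L": 4, "SRD2H": 5, "SRD2L": 6}

def select_bit_name(register_number):
    """Map a DAG register number (0-15) to the MODE1 bit that governs it."""
    if not 0 <= register_number <= 15:
        raise ValueError("DAG register numbers run from 0 to 15")
    dag = "1" if register_number < 8 else "2"          # 0-7: DAG1, 8-15: DAG2
    half = "H" if register_number % 8 >= 4 else "L"    # upper or lower four
    return "SRD" + dag + half

def uses_alternate(mode1, register_number):
    """True if the alternate set is active for this register (bit = 1)."""
    bit = SELECT_BITS[select_bit_name(register_number)]
    return bool((mode1 >> bit) & 1)
```

Setting only SRD1H, for example, switches I4-I7 (and the matching M, B and L registers) to the alternate set while I0-I3 remain on the primary set, so pointers held in the low half stay visible in both contexts.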


4.3 DAG OPERATION 

DAG operations include: 

• address output and modification, 

• modulo addressing (for circular buffers), and 

• bit-reversed addressing 

4.3.1 Address Output And Modification 

The processor can add an offset (modifier), either an M register or an 
immediate value, to an I register and output the resulting address; this is 
called a pre-modify without update operation. Or it can output the I register 
value as it is, and then add an M register or immediate value to form a 
new I register value. This is a post-modify operation. These operations are 
compared in Figure 4.3. The pre-modify operation does not change the 
value of the I register. The width of an immediate modifier depends on 
the instruction; it can be as much as the width of the I register. The L 
register and modulo logic do not affect a pre-modified address — 
pre-modify addressing is always linear, not circular. 
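The two addressing operations can be contrasted with a small behavioral model. This is a sketch under assumptions: the class and method names are hypothetical, addresses are assumed to wrap at the DAG width, and circular buffer (modulo) logic is left out.

```python
class Dag:
    """Minimal model of one DAG's index registers (no circular buffering)."""

    def __init__(self, width):
        self.mask = (1 << width) - 1   # 32 bits for DAG1, 24 bits for DAG2
        self.i = [0] * 8               # eight index registers

    def pre_modify(self, index, modifier):
        """Output I + modifier; the I register is not changed."""
        return (self.i[index] + modifier) & self.mask

    def post_modify(self, index, modifier):
        """Output I as it is, then update I with I + modifier."""
        address = self.i[index]
        self.i[index] = (address + modifier) & self.mask
        return address

dag2 = Dag(24)
dag2.i[7] = 0x1000                       # stands in for I15
pre_address = dag2.pre_modify(7, 4)      # address uses I + M
after_pre = dag2.i[7]                    # I register unchanged
post_address = dag2.post_modify(7, 4)    # address uses old I
after_post = dag2.i[7]                   # I register updated
```

The pre-modify access addresses 0x1004 and leaves the index at 0x1000; the post-modify access addresses 0x1000 and leaves the index at 0x1004.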

4.3.1.1 DAG Modify Instructions 

In ADSP-21020/21010 assembly language, pre-modify and post-modify 
operations are distinguished by the positions of the index and modifier (M 
register or immediate value) in the instruction. The I register before the 
modifier indicates a post-modify operation. If the modifier comes first, a 
pre-modify without update operation is indicated. 

PRE-MODIFY                     POST-MODIFY 
Without I Register Update      With I Register Update 

PM(Mx, Ix)                     PM(Ix, Mx) 
DM(Mx, Ix)                     DM(Ix, Mx) 

output: I + M                  1. output: I   2. update: I = I + M 

Figure 4.3 Pre-Modify and Post-Modify Operations 

The following instruction, for example, accesses the program memory location 
with an address equal to the value stored in I15, and the value I15 + M12 is 
written back to the I15 register: 

R6 = PM (115, Ml 2) ; Indirect addressing with post-modify 

If the order of the I and M registers is switched, however, 

R6 = PM (Ml 2, 115); Indirect addressing with pre-modify 

the instruction accesses the location in program memory with an address 
equal to 115 + M12, but does not change the value of 115. 

Any M register can modify any I register within the same DAG (DAG1 or
DAG2). Thus,

    DM(M0, I2) = TPERIOD;

is a legal instruction that accesses the data memory location M0 + I2;
however,

    DM(M0, I14) = TPERIOD;

is not a legal instruction because the I and M registers belong to different
DAGs.
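
The two addressing forms and the same-DAG rule can be illustrated with a small behavioral model. This is only a sketch in Python, not processor code: the function names and the dictionary-based register files are hypothetical, and masking results to 32 bits (DAG1) or 24 bits (DAG2) is an assumption based on the register widths given in this chapter.

```python
DAG1_MASK = 0xFFFFFFFF  # DAG1 (registers 0-7) holds 32-bit values
DAG2_MASK = 0x00FFFFFF  # DAG2 (registers 8-15) holds 24-bit values


def dag_of(index):
    """Return 1 for registers 0-7 (DAG1) or 2 for registers 8-15 (DAG2)."""
    if not 0 <= index <= 15:
        raise ValueError("register number must be in 0-15")
    return 1 if index < 8 else 2


def _mask_for(i_num, m_num):
    """Check the same-DAG rule and return the address mask for that DAG."""
    if dag_of(i_num) != dag_of(m_num):
        raise ValueError(f"I{i_num} and M{m_num} belong to different DAGs")
    return DAG1_MASK if dag_of(i_num) == 1 else DAG2_MASK


def pre_modify(i_regs, m_regs, i_num, m_num):
    """Model DM(Mx, Ix) / PM(Mx, Ix): output I + M, leave I unchanged."""
    mask = _mask_for(i_num, m_num)
    return (i_regs[i_num] + m_regs[m_num]) & mask


def post_modify(i_regs, m_regs, i_num, m_num):
    """Model DM(Ix, Mx) / PM(Ix, Mx): output I, then write I + M back to I."""
    mask = _mask_for(i_num, m_num)
    address = i_regs[i_num]
    i_regs[i_num] = (address + m_regs[m_num]) & mask
    return address
```

With I15 = 0x100 and M12 = 4, post_modify returns 0x100 and leaves I15 = 0x104; a following pre_modify returns 0x108 and leaves I15 at 0x104. Pairing I14 with M0 raises an error, mirroring the illegal instruction above.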










4.3.1.2 Immediate Modifiers

The magnitude of an immediate value that can modify an I register 
depends on the instruction type and whether the I register is in DAG1 or 
DAG2. DAG1 modify values can be up to 32 bits wide; DAG2 modify 
values can be up to 24 bits wide. Some instructions with parallel 
operations only allow modify values up to 6 bits wide. Here are two 
examples: 

32-bit modifier:

    R1=DM(0x40000000, I1);      DM address = I1 + 0x4000 0000

6-bit modifier:

    F6=F1+F2, PM(I8, 0x0B) = ASTAT;      PM address = I8, I8 = I8 + 0x0B

4.3.2 Circular Buffer Addressing 

The DAGs provide for addressing of locations within a circular data 
buffer. A circular buffer is a set of memory locations that stores data. An 
index pointer steps through the buffer, being post-modified and updated 
by the addition of a specified value (positive or negative) for each step. If 
the modified address pointer falls outside the buffer, the length of the 
buffer is subtracted from or added to the value, as required to wrap the 
index pointer back to the start of the buffer (see Figure 4.4). There is no 
restriction on the value of the base address for a circular buffer. 

Circular buffer addressing must use M registers for post-modify of I 
registers, not pre-modify; for example: 

    F1=DM(I0, M0);      Use post-modify addressing for circular buffers,

    F1=DM(M0, I0);      not pre-modify.

4.3.2.1 Circular Buffer Operation

You set up a circular buffer in assembly language by initializing an 
L register with a positive, nonzero value and loading the corresponding 
(same-numbered) B register with the base (starting) address of the buffer. 
The corresponding I register is automatically loaded with this same 
starting address. 

On the first post-modify access using the I register, the DAG outputs the I
register value on the address bus and then modifies it by adding the
specified M register or immediate value to it. If the modified value is
within the buffer range, it is written back to the I register. If the value is
outside the buffer range, the L register value is subtracted (or, if the
modify value is negative, added) first.

[Figure 4.4 Circular Data Buffers. Example: Length = 11, Base address = 0,
Modifier (step size) = 4. The sequence shows the order in which locations
are accessed in one pass; the sequence repeats on subsequent passes.]



If M is positive:

    I_new = I_old + M        if I_old + M < Buffer base + length (end of buffer)
    I_new = I_old + M - L    if I_old + M ≥ Buffer base + length (end of buffer)

If M is negative:

    I_new = I_old + M        if I_old + M ≥ Buffer base (start of buffer)
    I_new = I_old + M + L    if I_old + M < Buffer base (start of buffer)
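
The wraparound rule can be checked with a short behavioral model. This is a minimal sketch in Python rather than processor code; the function names are hypothetical, and the model treats the address base + length as the first location past the end of the buffer. It reproduces the access order for the Figure 4.4 parameters (length 11, base 0, step 4).

```python
def circular_post_modify(i, m, base, length):
    """Return (address_output, new_i) for one post-modify access."""
    if length <= 0:
        raise ValueError("L must be positive for circular buffering")
    if abs(m) >= length:
        raise ValueError("modify magnitude must be less than the buffer length")
    new_i = i + m
    if m >= 0 and new_i >= base + length:
        new_i -= length          # stepped past the end: wrap back
    elif m < 0 and new_i < base:
        new_i += length          # stepped before the start: wrap forward
    return i, new_i


def access_order(base, length, step, count):
    """List the addresses output by `count` successive post-modify accesses."""
    i = base                     # loading B also loads I with the base address
    order = []
    for _ in range(count):
        address, i = circular_post_modify(i, step, base, length)
        order.append(address)
    return order


if __name__ == "__main__":
    print(access_order(base=0, length=11, step=4, count=11))
    # prints [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7]
```

After eleven accesses the index is back at the base address, so the same sequence repeats on the next pass.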




4.3.2.2 Circular Buffer Registers

All four types of DAG registers are involved in the operation of a circular 
buffer: 

• The I register contains the value which is output on the address bus. 

• The M register contains the post-modify amount (positive or negative) 
which is added to the I register at the end of each memory access. The 
M register can be any M register in the same DAG as the I register and 
does not have to have the same number. The modify value can also be 
an immediate number instead of an M register. The magnitude of the 
modify value, whether from an M register or immediate, must be less 
than the length (L register) of the circular buffer. 

• The L register sets the size of the circular buffer and thus the address
range that the I register is allowed to circulate through. L must be
positive and cannot have a value greater than 2^31 - 1 (L0-L7) or 2^23 - 1
(L8-L15). If an L register's value is zero, its circular buffer operation is
disabled.

• The B register, or the B register plus the L register, is the value that the 
modified I value is compared to after each access. When the B register 
is loaded, the corresponding I register is simultaneously loaded with 
the same value. When I is loaded, B is not changed. B and I can be read 
independently. 

4.3.2.3 Circular Buffer Overflow Interrupts

There is one set of DAG registers for each memory space that can generate
an interrupt upon circular buffer overflow (i.e. address wraparound). For
data memory, the registers are B7, I7, L7, and for program memory they
are B15, I15, L15. Circular buffer overflow interrupts can be used to
implement a ping-pong (swap I/O buffer pointers) routine, for example.

Whenever a circular buffer addressing operation using these registers 
causes the address in the I register to be incremented (or decremented) 
past the end (or start) of the circular buffer, an interrupt is generated. 
Depending on which register set was used, the interrupt is either: 

                                    DAG Registers    Vector     Symbolic
Interrupt                           To Use           Address    Name*

DAG1 circular buffer 7 overflow     B7, I7, L7       0x58       CB7I
DAG2 circular buffer 15 overflow    B15, I15, L15    0x60       CB15I

* These symbols are defined in the #include file def21020.h. See Listing 8.5 in the
section "Initialization Following Reset (Initial Setups)" of Chapter 8, Programmer's
Tutorial, or the ADSP-21020/21010 Programmer's Quick Reference.



Specifically, an interrupt is generated during an instruction's address
post-modify when:

    (for M<0)    I + M < B

    (for M>0)    I + M ≥ B + L

The interrupts can be masked by clearing the appropriate bit in IMASK. 

There may be situations where you want to use I7 or I15 without circular
buffering but with the circular buffer overflow interrupts unmasked. To
disable the generation of these interrupts, set the B7/B15 and L7/L15
registers to values that assure that the conditions that generate interrupts
(as specified above) never occur. For example, when accessing the address
range 0x1000-0x2000, your program could set B=0x0000 and L=0xFFFF.
Note that setting the L register to zero will not achieve the desired results.

If you are using either of the circular buffer overflow interrupts, you
should avoid using the corresponding I register(s) (I7 and/or I15) in the
rest of your program, or be careful to set the B and L registers as described
above to prevent spurious interrupt branching.
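
The effect of the suggested B and L values can be checked against the overflow conditions with a small model. This is a sketch in Python under stated assumptions: the function names are hypothetical, the upper bound is treated as inclusive (address B + L is the first location past the buffer), and the model only evaluates the stated conditions, not the actual interrupt logic of the processor.

```python
def overflow_condition(i, m, b, l):
    """Evaluate the circular buffer overflow condition for one post-modify."""
    modified = i + m
    if m < 0:
        return modified < b      # stepped before the start of the buffer
    return modified >= b + l     # stepped past the end of the buffer


def any_overflow(addresses, m, b, l):
    """True if a post-modify by m from any listed address meets the condition."""
    return any(overflow_condition(a, m, b, l) for a in addresses)


if __name__ == "__main__":
    window = range(0x1000, 0x2001)   # addresses 0x1000-0x2000
    # B=0x0000, L=0xFFFF: the condition is never met inside the window.
    print(any_overflow(window, 1, 0x0000, 0xFFFF))
    print(any_overflow(window, -1, 0x0000, 0xFFFF))
    # B=0x0000, L=0: the condition is met by any positive step.
    print(overflow_condition(0x1000, 1, 0x0000, 0))
```

Under these assumptions the first two checks report no overflow and the third reports one, which is consistent with the note that setting L to zero does not suppress the interrupts.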

The STKY status register includes two bits that also indicate the
occurrence of a circular buffer overflow, bit 17 (DAG1 circular buffer 7
overflow) and bit 18 (DAG2 circular buffer 15 overflow). Rather than
remaining set until explicitly cleared, however, these bits are cleared by
the next memory access that uses the corresponding
I register (I7, I15). Circular buffer interrupts, therefore, should be used
instead of these STKY register bits.

4.3.3 Bit-Reversal 

Bit-reversal of data memory addresses can be performed in two ways: by
enabling the bit-reverse mode of DAG1 and using a specific I register (I0),
or by executing the explicit bit-reverse instruction (BITREV).

4.3.3.1 Bit-Reverse Mode

In bit-reverse mode, DAG1 bit-reverses 32-bit address values output from
I0. This mode is enabled by the BR0 bit in the MODE1 register. Only
address values from I0 are bit-reversed. This mode affects both pre-modify
and post-modify operations.

MODE1
Bit    Name    Definition

1      BR0     Bit-reverse for I0 (uses DMS0 only)



Important: Due to timing constraints, addresses output in bit-reverse
mode always activate DMS0 (Data Memory Select 0) and the number of
wait states associated with it, regardless of the actual address value. In
most systems, this means that a bit-reversed address must be within the
lowest bank of data memory space. (See Chapter 6 for more information
on memory banks.)

Bit-reversal occurs at the output of DAG1 and does not affect the value in
I0. In the case of a post-modify operation, the update value is not
bit-reversed. However, after a data memory access using I0, you can read the
bit-reversed address from universal register DMADR, which holds the last
data memory address output.

Example:

    I0=0x80400000;
    R1=DM(I0,3);        DM address = 0x201, I0 = 0x8040 0003

4.3.3.2 Bit-Reverse Instruction

The BITREV instruction modifies and bit-reverses addresses in any DAG1
index register (I0-I7) without accessing external data memory. This
instruction is independent of the bit-reverse mode (BR0 bit in MODE1).
The BITREV instruction adds a 32-bit immediate value to a DAG1 index
register, bit-reverses the result and writes the result back to the same
index register. The bit-reversed value appears on the data memory
address bus, but no strobes are active.

Example:

    BITREV(I1,4);       I1 = Bit-reverse of (I1 + 4)
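
Both forms of bit-reversal can be modeled in a few lines. The following is a sketch in Python, not processor code; the function names are hypothetical, and wrapping sums to 32 bits is an assumption based on the 32-bit width of the DAG1 registers.

```python
def bit_reverse32(value):
    """Reverse the order of the 32 bits in value (bit 31 swaps with bit 0)."""
    value &= 0xFFFFFFFF
    result = 0
    for _ in range(32):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bitrev_mode_post_modify(i0, modifier):
    """Model DM(I0, modifier) with BR0 set: return (address_output, new_i0).

    The output address is bit-reversed; the value written back to I0 is not.
    """
    address = bit_reverse32(i0)
    new_i0 = (i0 + modifier) & 0xFFFFFFFF
    return address, new_i0


def bitrev_instruction(i_value, immediate):
    """Model BITREV(Ix, immediate): return the new Ix value."""
    return bit_reverse32(i_value + immediate)


if __name__ == "__main__":
    address, i0 = bitrev_mode_post_modify(0x80400000, 3)
    print(hex(address), hex(i0))   # prints 0x201 0x80400003
```

The printed values match the bit-reverse mode example above: bits 31 and 22 of 0x80400000 map to bits 0 and 9 of the output address.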


4.4 DAG REGISTER TRANSFERS 

DAG registers are part of the universal register set and may be written 
from data memory, another universal register or an immediate field in an 
instruction. Their contents may be written to data memory or a universal 
register. 

Transfers between 32-bit DAG1 registers (7-0) and the 40-bit DMD bus are
aligned to bits 39-8 of the DMD bus. When 24-bit DAG2 registers (15-8)
are read to the 40-bit DMD bus, M register values are sign-extended to 32
bits, and I, L, and B register values are zero-filled to 32 bits. The results are
aligned to bits 39-8 of the DMD bus. When DAG2 registers are written
from the DMD bus, bits 31-8 are transferred and the rest are ignored.

Figure 4.5 illustrates these transfers. 
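
The alignment rules can be expressed as bit operations. This is a minimal sketch in Python, not processor code; the function names are hypothetical, and the 40-bit DMD bus value is represented as a plain integer with bit 0 as the least significant bit.

```python
def dag1_to_dmd(value32):
    """Place a 32-bit DAG1 register value on bits 39-8 of the DMD bus."""
    return (value32 & 0xFFFFFFFF) << 8


def dag2_to_dmd(value24, is_m_register):
    """Read a 24-bit DAG2 register onto the 40-bit DMD bus.

    M registers are sign-extended to 32 bits; I, L, and B registers are
    zero-filled. The 32-bit result is then aligned to bits 39-8.
    """
    value24 &= 0xFFFFFF
    if is_m_register and value24 & 0x800000:
        value32 = value24 | 0xFF000000   # copy the sign bit into the top byte
    else:
        value32 = value24
    return value32 << 8


def dmd_to_dag2(dmd40):
    """Write a DAG2 register from the DMD bus: only bits 31-8 are used."""
    return (dmd40 >> 8) & 0xFFFFFF


if __name__ == "__main__":
    print(hex(dag2_to_dmd(0xFFFFFC, is_m_register=True)))    # M = -4
    print(hex(dag2_to_dmd(0xFFFFFC, is_m_register=False)))   # same bits in an I register
```

A value written to a DAG2 register and read back this way returns its low 24 bits unchanged, because the extension bits occupy DMD bits 39-32, which are ignored on a write.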

4.4.1 DAG Register Transfer Restrictions 

For certain instruction sequences involving transfers to and from DAG
registers, an extra (NOP) cycle is either automatically inserted by the
processor (1, 2) or must be inserted in code by the programmer (3). Certain
other sequences cause incorrect results and are not allowed by the
ADSP-21020/21010 Assembler (4).

1.) When an instruction that loads a DAG register is followed by an 
instruction that uses any register in the same DAG for data addressing, 
the ADSP-21020/21010 inserts an extra (NOP) cycle between the two 
instructions. This happens because the same bus is needed by both 
operations in the same cycle, therefore the second operation must be 
delayed. An example is: 


    L2=8;
    DM(I0,M1)=R1;

Because L2 is in the same DAG as I0 (and M1), an extra cycle is inserted
after the write to L2.


[Figure 4.5 DAG Register Transfers: DMD bus alignment of DAG1 registers
(7-0), DAG2 I, L, or B registers (15-8, zero-filled), and DAG2 M registers
(15-8, sign-extended).]


2.) For the same reason, the ADSP-21020/21010 also inserts an extra cycle
after an instruction that writes a memory control register if it is followed
by an instruction that uses a register in the corresponding DAG (DAG1 for
data memory control registers, DAG2 for program memory control
registers). Data memory control registers are DMWAIT, DMBANK1-3 and
DMADR. Program memory control registers are PMWAIT, PMBANK1
and PMADR. (Note that because the DAG2 registers are used to fetch
instructions or access data in every cycle, a write to a program memory
control register will always require an extra cycle to be inserted.)

Each of the following instruction sequences, for example,

    PMWAIT=0x080000;      or      DMBANK1=0x10000000;
    NOP;                          R15=DM(I0,M1);

causes the ADSP-21020 to insert an extra cycle between the two
instructions.

3.) An instruction that writes any L or M register of DAG2
(L8-L15, M8-M15), immediately followed by an instruction that reads the
corresponding I register will result in incorrect data being read from the
I register. The following instruction sequence, for example,

    L8=24;
    R0=I8;

will cause incorrect data to be read from I8. To prevent this, add a NOP to
your program between the two instructions (i.e. the L or M register write
and the I register read):

    L8=24;
    NOP;
    R0=I8;

4.) The following kinds of instructions can execute on the processor, but
cause incorrect results; these instructions are disallowed by the
ADSP-21020/21010 Assembler:

• An instruction that stores a DAG register in memory using indirect 
addressing from the same DAG, with or without update of the index 
register. The instruction writes the wrong data to memory or updates 
the wrong index register. 

Examples:

    DM(M2, I1)=I0;      or      DM(I1, M2)=I0;

• An instruction that loads a DAG register from memory using indirect 
addressing from the same DAG, with update of the index register. The 
instruction will either load the DAG register or update the index 
register, but not both. 

Example: 

    L2=DM(I1,M0);

5.1 OVERVIEW

The ADSP-21020/21010 has a programmable interval timer that can
generate periodic interrupts. You program the timer by writing two
universal registers, and you control timer operation through a bit in the
MODE2 register. An external output, TIMEXP, signals to other devices
that the timer count has expired.


5.2 TIMER OPERATION 

Figure 5.1, on the next page, shows a block diagram of the timer. Two 
universal registers, TPERIOD and TCOUNT, control the timer interval. 

Register    Function                  Bits

TPERIOD     Timer Period Register     32
TCOUNT      Timer Counter Register    32

The TCOUNT register contains the timer counter. The timer decrements 
the TCOUNT register each clock cycle. When the TCOUNT value reaches 
zero, the timer generates an interrupt and asserts the TIMEXP output high 
for 4 cycles (see Figure 5.2, also on the next page). On the next clock cycle 
after TCOUNT reaches zero, the timer automatically reloads TCOUNT 
from the TPERIOD register. 

The TPERIOD value specifies the frequency of timer interrupts. The
number of cycles between interrupts is TPERIOD + 1. The maximum
value of TPERIOD is 2^32 - 1, so if the clock cycle is 50 ns, the maximum
interval between interrupts is 214.75 seconds.
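
The interval arithmetic can be checked with a short calculation. This is a sketch in Python, not processor code; the function names are hypothetical, and the cycle-by-cycle simulation is a simplified model of the decrement and reload behavior described above, not a timing-accurate one.

```python
def interval_seconds(tperiod, cycle_ns):
    """Time between timer interrupts: (TPERIOD + 1) clock cycles."""
    return (tperiod + 1) * cycle_ns * 1e-9


def tperiod_for_interval(interval_ns, cycle_ns):
    """TPERIOD value giving one interrupt every interval_ns nanoseconds."""
    cycles = interval_ns // cycle_ns
    if not 1 <= cycles <= 2**32:
        raise ValueError("interval out of range for a 32-bit TPERIOD")
    return cycles - 1


def simulate_timer(tcount, tperiod, cycles):
    """Return the cycle numbers on which TCOUNT reaches zero."""
    hits = []
    for cycle in range(cycles):
        if tcount == 0:
            tcount = tperiod     # reload on the cycle after reaching zero
        else:
            tcount -= 1
            if tcount == 0:
                hits.append(cycle)
    return hits


if __name__ == "__main__":
    print(round(interval_seconds(2**32 - 1, 50), 2))   # prints 214.75
    print(simulate_timer(3, 3, 12))                    # prints [2, 6, 10]
```

With TPERIOD = 3 the simulated interrupts are four cycles apart, matching the TPERIOD + 1 rule. For a 1 ms interval at a 50 ns cycle, tperiod_for_interval gives 19999.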

5.2.1 Timer Enable And Disable 

To start and stop the timer, you enable and disable it through a bit in the 
MODE2 register. With the timer disabled, you load TCOUNT with an 
initial count value and TPERIOD with the number of cycles for the 
interval you want. Then you enable the timer when you want to begin the 
count. 






At reset, the timer enable bit in the MODE2 register is cleared, so the timer 
is disabled. When the timer is disabled, it does not decrement the 
TCOUNT register and it generates no interrupts. When the timer enable 
bit is set, the timer starts decrementing the TCOUNT register at the end of 
the next clock cycle. If the bit is subsequently cleared, the timer is disabled 
and stops decrementing TCOUNT after the next clock cycle (see Figure 5.3). 

MODE2 

Bit Name Definition 

5 TIMEN Timer enable 

[Figure 5.3 Timer Enable and Disable. Timer enable: after TIMEN is set in
MODE2, TCOUNT holds its value N for one more clock cycle and then
decrements to N-1 as the timer becomes active. Timer disable: after TIMEN
is cleared in MODE2, TCOUNT decrements once more (M-1 to M-2) and then
holds as the timer becomes inactive.]

5.2.2 Timer Interrupt

When the value of TCOUNT reaches zero, the timer generates two
interrupts, one with a relatively high priority, the other with a relatively
low priority. At reset, both are masked. You should unmask only the
timer interrupt that has the priority you want, and leave the other
masked.

IRPTL
Bit    Name     Vector    Function

4      TMZHI    0x20      Timer=0 (high priority option)
14     TMZLI    0x70      Timer=0 (low priority option)


Interrupt priority determines which interrupt is serviced first when two 
occur in the same cycle. It also affects interrupt nesting; when nesting is 
enabled, only higher priority interrupts can interrupt a service routine. 


Like other interrupts, the timer interrupt requires two cycles to fetch and 
decode the first service routine instruction. The service routine begins 
executing four cycles after the timer count goes to zero, as shown in 
Figure 5.4. 

[Figure 5.4 Timer Interrupt Timing. After TCOUNT goes from 1 to 0, the
processor takes the timer interrupt vector, executes two NOP cycles while
the first service routine instruction is fetched and decoded, and then
executes that instruction.]


5.3 TIMER REGISTERS 

Both the TPERIOD and TCOUNT registers can be read and written 
through universal register transfers. Reading the registers has no effect on 
the timer function. An explicit write to TCOUNT has priority over both 
the loading of TCOUNT from TPERIOD and the decrementing of 
TCOUNT. 

Neither TCOUNT nor TPERIOD is affected by a reset, so you should
initialize both registers after reset before enabling the timer.

6.1 MEMORY MANAGEMENT AND INTERFACE

This chapter describes the memory management and interface capabilities
of the ADSP-21020/21010. In addition, Chapter 9 shows several example
systems with different memory configurations.

The ADSP-21020/21010 has two distinct but similar memory interfaces: 
one for program memory, which contains both instructions and data, and 
one for data memory, which contains data only. The processor is capable 
of connecting to a number of different memory devices and memory- 
mapped peripherals. Minimal external hardware is required for a variety 
of configurations. 

The ADSP-21020/21010 provides on-chip memory management. Program 
and data memory spaces are user-configurable into banks (two for 
program memory, four for data memory). Wait states for each bank are 
independently programmable. The processor also detects page boundaries 
to facilitate memory paging. 

The bus request/bus grant protocol allows an external device to take 
control of the processor's memory buses. This is useful for transferring 
data to its memory, for example. The ADSP-21020/21010 also has an 
internal bus exchange path for transferring data between the program 
memory and data memory spaces.

6.2 MEMORY BUSES AND CONTROL PINS

The ADSP-21020/21010 accesses program memory through its program 
memory interface. Two types of accesses occur across the program 
memory buses: instruction fetches and program memory data accesses. 
The program memory interface consists of the following pins: 


Pin        Type   Definition

PMA23-0    O      Program Memory Address. The ADSP-21020/21010 outputs
                  an address in program memory on these pins.

PMD47-0    I/O    Program Memory Data. The ADSP-21020/21010 inputs and
                  outputs data and instructions on these pins. 32-bit
                  fixed-point data and 32-bit single-precision
                  floating-point data are transferred over bits 47-16 of
                  the PMD bus.

PMS1-0     O      Program Memory Select lines 1 & 0. These pins are
                  asserted as chip selects for the corresponding banks of
                  program memory. Memory banks must be defined in the
                  processor's memory control registers. These pins are
                  decoded program memory address lines and provide an
                  early indication of a possible bus cycle.

PMRD       O      Program Memory Read strobe. This pin is asserted when
                  the ADSP-21020/21010 reads from program memory.

PMWR       O      Program Memory Write strobe. This pin is asserted when
                  the ADSP-21020/21010 writes to program memory.

PMACK      I      Program Memory Acknowledge. An external device
                  deasserts this input to add wait states to a memory
                  access.

PMPAGE     O      Program Memory Page Boundary. The ADSP-21020/21010
                  asserts this pin to signal that a program memory page
                  boundary has been crossed. Memory pages must be defined
                  in the processor's memory control registers.

PMTS       I      Program Memory Three-State Control. PMTS places the
                  program memory address, data, selects, and strobes in a
                  high-impedance state. If PMTS is asserted while a PM
                  access is in progress, the processor will halt and the
                  memory access will not be completed. PMACK must be
                  asserted for at least one cycle when PMTS is deasserted
                  to allow any pending memory access to complete
                  properly. PMTS should only be asserted (low) during an
                  active memory access cycle.

O=Output, I=Input. When groups of pins are identified with subscripts,
e.g. PMD47-0, the highest numbered pin is the MSB (in this case, PMD47).

The ADSP-21020/21010 accesses data memory through its data memory
interface. The data memory interface consists of the following pins: 


Pin        Type   Definition

DMA31-0    O      Data Memory Address. The ADSP-21020/21010 outputs an
                  address in data memory on these pins.

DMD39-0    I/O    Data Memory Data (ADSP-21020). The ADSP-21020 inputs
                  and outputs data on these pins. 32-bit fixed-point data
                  and 32-bit single-precision floating-point data are
                  transferred over bits 39-8 of the DMD bus.

DMD31-0    I/O    Data Memory Data (ADSP-21010). The ADSP-21010 inputs
                  and outputs data on these pins. (DMD31-0 on the
                  ADSP-21010 corresponds to DMD39-8 on the ADSP-21020.
                  This should be taken into account if upgrading is
                  planned.)

DMS3-0     O      Data Memory Select lines 0, 1, 2, & 3. These pins are
                  asserted as chip selects for the corresponding banks of
                  data memory. Memory banks must be defined in the
                  processor's memory control registers. These pins are
                  decoded data memory address lines and provide an early
                  indication of a possible bus cycle.

DMRD       O      Data Memory Read strobe. This pin is asserted when the
                  ADSP-21020/21010 reads from data memory.

DMWR       O      Data Memory Write strobe. This pin is asserted when the
                  ADSP-21020/21010 writes to data memory.

DMACK      I      Data Memory Acknowledge. An external device deasserts
                  this input to add wait states to a memory access.

DMPAGE     O      Data Memory Page Boundary. The ADSP-21020/21010
                  asserts this pin to signal that a data memory page
                  boundary has been crossed. Memory pages must be defined
                  in the processor's memory control registers.

DMTS       I      Data Memory Three-State Control. DMTS places the data
                  memory address, data, selects, and strobes in a
                  high-impedance state. If DMTS is asserted while a DM
                  access is in progress, the processor will halt and the
                  memory access will not be completed. DMACK must be
                  asserted for at least one cycle when DMTS is deasserted
                  to allow any pending memory access to complete
                  properly. DMTS should only be asserted (low) during an
                  active memory access cycle.

O=Output, I=Input. When groups of pins are identified with subscripts,
e.g. DMD39-0, the highest numbered pin is the MSB (in this case, DMD39).

6.3 MEMORY INTERFACE TIMING

This section describes the relative timing of memory interface signals 
during memory accesses. The descriptions apply to both program 
memory and data memory accesses. The following generic signal names 
represent the memory control signals; the signals actually used in a 
particular access depend on whether program memory or data memory is 
being accessed and which bank contains the address. 


Signal Name      Program Memory    Data Memory

Address          PMA23-0           DMA31-0
Data             PMD47-0           DMD39-0
Memory select    PMS1-0            DMS3-0
Read strobe      PMRD              DMRD
Write strobe     PMWR              DMWR
Acknowledge      PMACK             DMACK


6.3.1 Memory Read 

Memory reads occur with the following sequence of events (see Figure 6.1): 


1. The ADSP-21020 drives the read address and asserts a memory select 
signal to indicate the selected bank. A memory select signal is not 
deasserted between successive accesses of the same memory bank. 

2. The ADSP-21020 asserts the read strobe (unless the memory access is 
aborted because of a conditional instruction). 

3. The ADSP-21020 checks whether wait states are needed. If so, the 
memory select and read strobe remain active for additional cycle(s). 
Wait states are determined by the state of the external acknowledge 
signal, the internally programmed wait state count, or a combination 
of the two (see "Wait States," later in this chapter). 

4. The ADSP-21020 latches in the data. 


5. The ADSP-21020 deasserts the read strobe. 


6. If initiating another memory access, the ADSP-21020 drives the 
address and memory select for the next cycle. 

Note that if a memory read is part of a conditional instruction that is not 
executed because the condition is false, the ADSP-21020 still drives the 
address and memory select for the read, but does not assert the read 
strobe or read any data.

6.3.2 Memory Write

Memory writes occur with the following sequence of events (see Figure 6.2,
on the following page):

1. The ADSP-21020 drives the write address and asserts a memory select 
signal to indicate the selected bank. A memory select signal is not 
deasserted between successive accesses of the same memory bank. 

2. The ADSP-21020 asserts the write strobe and drives the data (unless 
the memory access is aborted because of a conditional instruction). 

3. The ADSP-21020 checks whether wait states are needed. If so, the
memory select and write strobe remain active for additional cycle(s).
Wait states are determined by the state of the external acknowledge
signal, the internally programmed wait state count, or a combination
of the two (see "Wait States," later in this chapter).

4. The ADSP-21020 deasserts the write strobe near the end of the cycle. 

5. The ADSP-21020 tristates its data outputs. 

6. If initiating another memory access, the ADSP-21020 drives the 
address and memory select for the next cycle.

[Figure 6.2 Memory Write Cycle]


Note that if a memory write is part of a conditional instruction that is not 
executed because the condition is false, the ADSP-21020 still drives the 
address and memory select for the write, but does not assert the write 
strobe or drive any data. 

6.3.3 Three-State Controls 

The memory bus three-state enables, DMTS and PMTS, prevent the 
ADSP-21020 from driving its external data memory port and program 
memory port, respectively. The corresponding acknowledge signal 
(DMACK or PMACK) is not sampled when a three-state enable is active. 
DMTS and PMTS allow an external device, such as a cache controller, to 
take control of the memory interface by first asserting a three-state enable 
to keep the ADSP-21020 from driving the bus. When the external device 
deasserts the three-state enable, the processor resumes driving its memory 
port if it was executing a memory access. 

These controls facilitate the implementation of an external cache system.
When the processor tries to access data that is not in the cache, the
controller asserts the three-state control, writes the needed data into cache,
then releases the three-state control so that the ADSP-21020 can finish the
data access. Unlike with bus request (BR), the ADSP-21020 does not finish
its current instruction before it three-states the memory port.

You must use memory acknowledge (DMACK or PMACK) in conjunction
with DMTS or PMTS, whether or not the memory requires wait states. The 
first reason is that there must be an extra cycle after the three-state enable 
is deasserted for the memory cycle to complete; the acknowledge should 
be deasserted (low) in the same cycle that the three-state enable is 
deasserted in. The second reason is that the ADSP-21020 counts wait states 
whether or not the memory outputs are enabled, so internally 
programmed wait states cannot be used. If memory requires wait states, 
use DMACK or PMACK to insert them after the extra cycle that follows 
the deassertion of the three-state enable. 

DMTS and PMTS do not halt the ADSP-21020 if the processor does not 
require a memory access from the three-stated port. In practice, PMTS will 
halt the ADSP-21020 because either an instruction fetch or program 
memory data access needs to occur every cycle. When only DMTS is 
asserted, however, the processor can continue running until it reaches a 
data memory access. 

DMTS controls the following pins:

    DMA31-0
    DMD39-0 (DMD31-0 on the ADSP-21010)
    DMS0
    DMS1
    DMS2
    DMS3
    DMRD
    DMWR
    DMPAGE

PMTS controls the following pins:

    PMA23-0
    PMD47-0
    PMS0
    PMS1
    PMRD
    PMWR
    PMPAGE



6.4 MEMORY BANKS 

Each address space on the ADSP-21020 can be divided into banks for 
selection. The program memory address space is divided into two banks. 
The data memory address space is divided into four banks. The relative 
size of these banks is under user control through the registers PMBANK1, 
DMBANK1, DMBANK2 and DMBANK3. 

Bank 0 of program memory spans address 0 up to but not including the
value in the 24-bit PMBANK1 register. Program memory bank 1 starts at
the value in PMBANK1 and runs to the end of program memory space.
Each bank has a separate memory select pin (PMS0 and PMS1) that is
asserted when the ADSP-21020 outputs an address in the corresponding
memory bank. Wait states for the two banks are independently controlled,
as described under "Extending Memory Cycles with Wait States."

Similarly, bank 0 of data memory spans address 0 up to but not including 
the value in the 32-bit DMBANK1 register. Bank 1 runs from the value in 
DMBANK1 up to the value in DMBANK2, bank 2 from DMBANK2 up to 
DMBANK3, and bank 3 from DMBANK3 to the end of data memory 
space. For proper operation, the address in DMBANK2 should be greater 
than or equal to that in DMBANK1, and the address in DMBANK3 should 
be greater than or equal to that in DMBANK2. As in program memory, 
each data memory bank has its own memory select pin and independent 
wait state control. 


Note: When bit-reverse mode is enabled (bit 1 in MODE1 is set), data
memory accesses that use I0 will activate DMS0 and insert the number of
wait states programmed for bank 0, regardless of the value of the
bit-reversed address. In most systems, this means that bit-reversed mode can
only be used to access bank 0.


If a memory access is aborted (because of a non-delayed branch, for 
example), the memory select signal may be asserted even though there is 
no memory access. 
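
The bank boundaries amount to a simple address decode. The following Python sketch models that decode; it is not processor code, the function names are hypothetical, and the constants are the reset values of the bank registers listed in this section.

```python
PMBANK1_RESET = 0x800000
DMBANK1_RESET = 0x20000000
DMBANK2_RESET = 0x40000000
DMBANK3_RESET = 0x80000000


def program_memory_bank(address, pmbank1=PMBANK1_RESET):
    """Return the program memory bank (0 or 1) selected by a 24-bit address."""
    return 0 if address < pmbank1 else 1


def data_memory_bank(address,
                     dmbank1=DMBANK1_RESET,
                     dmbank2=DMBANK2_RESET,
                     dmbank3=DMBANK3_RESET,
                     bit_reversed_i0=False):
    """Return the data memory bank (0-3) selected by a 32-bit address."""
    if not dmbank1 <= dmbank2 <= dmbank3:
        raise ValueError("require DMBANK1 <= DMBANK2 <= DMBANK3")
    if bit_reversed_i0:
        return 0                 # bit-reverse mode accesses via I0 use bank 0
    if address < dmbank1:
        return 0
    if address < dmbank2:
        return 1
    if address < dmbank3:
        return 2
    return 3


if __name__ == "__main__":
    for addr in (0x1FFFFFFF, 0x20000000, 0x40000000, 0x80000000):
        print(hex(addr), data_memory_bank(addr))
```

Setting two adjacent bank registers to the same value makes the bank between them empty, which this model handles without a special case.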

At reset, the memory bank address registers contain values as follows: 


Register    Value at Reset

PMBANK1     0x800000
DMBANK1     0x20000000
DMBANK2     0x40000000
DMBANK3     0x80000000

6.5 WAIT STATES (EXTENDED MEMORY CYCLES)

To simplify the interface to slow off-chip peripherals and slow memories, 
the ADSP-21020 allows a variety of methods for extending off-chip 
memory accesses. 

• External. The ADSP-21020 samples its acknowledge input (DMACK or 
PMACK) in each clock cycle. If it latches a low value, it inserts a wait 
state by holding strobes and address on the interface valid an 
additional cycle. If the value is high, the ADSP-21020 completes the 
cycle. 

• Internal. The ADSP-21020 ignores the acknowledge input. Three bits in 
a control register specify the number of wait states (zero to seven) for 
the access. You can specify a different number for each bank of 
memory. 

• Both. The ADSP-21020 samples its acknowledge input in each clock 
cycle. If it latches a low value, the ADSP-21020 inserts a wait state. If 
the value is high, the ADSP-21020 completes the cycle only if the 
number of wait states specified internally have expired. In this mode, 
the internal wait states specify a minimum number of cycles per access, 
and an external device can use the acknowledge pin to extend the 
access as necessary. 

• Either. The ADSP-21020 completes the cycle as soon as it samples the 
acknowledge input high or the internally specified number of wait 
states have expired, whichever occurs first. In this mode, a system with 
two types of peripherals could shorten the cycle for the faster 
peripheral using the acknowledge but use the internal wait states for 
the slower peripheral. 

The method selected for each bank of memory is independent of the other 
banks. Thus, you can map different speed devices into different memory 
banks for the appropriate wait state control. 
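
The four methods can be compared with a small Python model. It is a simplified sketch under stated assumptions: the function name, the method strings and the per-cycle list of acknowledge samples are all hypothetical, and real timing is governed by the data sheet rather than by this model.

```python
def wait_states(method, programmed, ack_samples):
    """Count the wait states inserted for one off-chip access.

    method      -- "external", "internal", "both" or "either"
    programmed  -- internally programmed number of wait states (0 to 7)
    ack_samples -- acknowledge level latched in each cycle (True = high)
    """
    for cycle, ack in enumerate(ack_samples):
        internal_done = cycle >= programmed   # programmed wait states have expired
        if method == "external":
            done = ack
        elif method == "internal":
            done = internal_done              # acknowledge input is ignored
        elif method == "both":
            done = ack and internal_done      # internal count is a minimum
        elif method == "either":
            done = ack or internal_done       # whichever occurs first
        else:
            raise ValueError("unknown method: " + method)
        if done:
            return cycle                      # cycles spent waiting before completion
    raise RuntimeError("access did not complete within the supplied samples")
```

In "both" mode the result is never smaller than `programmed`; in "either" mode it is never larger.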

Two bits specify the wait state method and three bits specify the number 
of wait states (zero to seven) for each bank of each memory space. These 
control bits are located in the DMWAIT and PMWAIT registers, shown in 
Figure 6.3. The mode bits are decoded as follows: 


Mode    Description 

00      External acknowledge only 
01      Internal wait states only 
10      Both internal and external acknowledge 
11      Either internal or external acknowledge 


Wait state control for the program memory interface is determined by the 
following bits in the PMWAIT register: 

PMWAIT 

Bits    Function 

9-7     Number (in binary) of program memory bank 1 wait states 
6-5     Wait state mode for program memory bank 1 
4-2     Number (in binary) of program memory bank 0 wait states 
1-0     Wait state mode for program memory bank 0 

Wait state control for the data memory interface is determined by the 
following bits in the DMWAIT register: 

DMWAIT 

Bits     Function 

19-17    Number (in binary) of data memory bank 3 wait states 
16-15    Wait state mode for data memory bank 3 
14-12    Number (in binary) of data memory bank 2 wait states 
11-10    Wait state mode for data memory bank 2 
9-7      Number (in binary) of data memory bank 1 wait states 
6-5      Wait state mode for data memory bank 1 
4-2      Number (in binary) of data memory bank 0 wait states 
1-0      Wait state mode for data memory bank 0 

6.5.1 Extended Data Memory Address Hold Time 

The ADSP-21020 holds its data memory address outputs from one cycle to 
the next until the next data memory access or BITREV instruction causes 
the address to change. This feature simplifies the interface to peripherals 
requiring long address hold times. The programmer ensures that the next 
data memory access or BITREV instruction occurs after the address hold 
requirement has been met. For example, inserting a NOP after a data memory 
access instruction guarantees that the address will be held for two cycles. 

DMWAIT Register 

Bits     Field 

23       Automatic wait state on boundary crossing 
22-20    DRAM data memory page size† 
19-17    Bank 3 number of wait states 
16-15    Bank 3 wait state mode* 
14-12    Bank 2 number of wait states 
11-10    Bank 2 wait state mode* 
9-7      Bank 1 number of wait states 
6-5      Bank 1 wait state mode* 
4-2      Bank 0 number of wait states 
1-0      Bank 0 wait state mode* 

PMWAIT Register 

Bits     Field 

13       Automatic wait state on boundary crossing 
12-10    DRAM program memory page size† 
9-7      Bank 1 number of wait states 
6-5      Bank 1 wait state mode* 
4-2      Bank 0 number of wait states 
1-0      Bank 0 wait state mode* 

† DRAM Memory Page Size Codes 

000      256 Words 
001      512 Words 
010      1024 Words 
011      2048 Words 
100      4096 Words 
101      8192 Words 
110      16384 Words 
111      32768 Words 

* Wait State Mode Codes 

00       External acknowledge only 
01       Internal wait states only 
10       Both external and internal required 
11       Either external or internal sufficient 

Figure 6.3 Wait State Control Registers 

6.6 MEMORY PAGE BOUNDARY DETECTION 

Applications that use large amounts of data may need to use dynamic 
RAMs (DRAMs) for bulk storage. To simplify its interface to page-mode 
DRAMs, the ADSP-21020 detects page boundary crossings and outputs a 
signal to a DRAM controller. Page boundaries are user-defined and are 
determined by page size fields in the wait state control registers. 

The ADSP-21020 detects boundary crossings by comparing each address it 
outputs to the one just previously output (in the same memory space). For 
instance, if a page is 1024 words long, any of the lower ten address bits 
can change from one address to the next without addressing a different 
page. If the upper 22 (for data memory) or 14 (for program memory) 
address bits are the same, the addresses are in the same memory page. If 
any of the upper bits change, however, the ADSP-21020 signals a page 
boundary crossing by asserting the PMPAGE pin (for program memory) 
or the DMPAGE pin (for data memory). 
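
The comparison can be illustrated with a short Python sketch. It assumes the page size codes of the wait state control registers (000 = 256 words up to 111 = 32768 words), so that a code of n leaves 8 + n address bits inside a page; the function names are hypothetical.

```python
def page_number(address, page_size_code):
    """Return the upper address bits that identify the memory page."""
    offset_bits = 8 + page_size_code      # code 0b010 -> 10 bits -> 1024-word pages
    return address >> offset_bits


def crosses_page(previous_address, address, page_size_code):
    """True if the new address lies in a different page than the previous one."""
    return page_number(previous_address, page_size_code) != page_number(
        address, page_size_code
    )
```

With 1024-word pages, `crosses_page(0x3FF, 0x400, 0b010)` is true because one of the upper bits changes, while an access from 0x400 to 0x7FF stays within the same page.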

If the memory access that crosses a page boundary is aborted (because of a 
non-delayed branch, for example), the PAGE pin may be asserted even 
though there is no memory access. In this case, the ADSP-21020 retains the 
address from the previous access to use for comparison; that is, the ADSP- 
21020 recognizes that the boundary was not actually crossed. 

The PAGE pin is asserted only on the first access to a page. It is always 
asserted on the first memory access after a reset. The PAGE pin has the 
same timing as the address pins. 

It is important to remember that memory pages and memory banks are 
handled independently by the ADSP-21020. Page boundary checking 
works the same way whether the same bank or a different bank is being 
accessed. 

6.6.1 Page Size 

Three bits in PMWAIT and three in DMWAIT specify the memory page 
size in program memory and data memory, respectively: 

PMWAIT 

Bits Function 

12-10 Program memory page size 

DMWAIT 

Bits Function 

22-20 Data memory page size 



The page boundary detection logic interprets these bits according to the 
following table: 


Bit Values    Page Size (Words) 

000           256 
001           512 
010           1024 
011           2048 
100           4096 
101           8192 
110           16384 
111           32768 


6.6.2 Wait States On Page Boundary Crossings 

One bit each in the DMWAIT and PMWAIT registers controls automatic 
wait state generation for page boundary crossings. If this bit is a 1, the 
ADSP-21020 inserts one wait state if the access crosses a page boundary 
and if the access would not otherwise include a wait state. If the access 
already includes at least one wait state, it is not affected. 


PMWAIT 

Bit     Function 

13      1 = automatic wait state for access across page boundary; 
        0 = no automatic wait state 

DMWAIT 

Bit     Function 

23      1 = automatic wait state for access across page boundary; 
        0 = no automatic wait state 


At reset, this bit is a 0 in both PMWAIT and DMWAIT, disabling 
automatic wait states. The ADSP-21020 always performs page boundary 
detection, whether or not the automatic wait states are enabled. 
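
The field positions given in sections 6.5 and 6.6 follow a regular pattern (five bits per bank, then the page size, then the automatic wait state bit), which the following Python sketch uses to decode either register. The function and key names are hypothetical; only the bit positions come from the text.

```python
MODE_NAMES = {0b00: "external", 0b01: "internal", 0b10: "both", 0b11: "either"}


def decode_wait_register(value, banks):
    """Split a DMWAIT (banks=4) or PMWAIT (banks=2) value into its fields."""
    fields = {}
    for bank in range(banks):
        shift = 5 * bank                          # each bank occupies five bits
        fields[f"bank{bank}_mode"] = MODE_NAMES[(value >> shift) & 0b11]
        fields[f"bank{bank}_wait_states"] = (value >> (shift + 2)) & 0b111
    page_shift = 5 * banks                        # bit 20 for DMWAIT, bit 10 for PMWAIT
    fields["page_size_words"] = 256 << ((value >> page_shift) & 0b111)
    fields["auto_wait_on_page_crossing"] = bool((value >> (page_shift + 3)) & 1)
    return fields
```

For instance, a PMWAIT value with bit 13 set and 010 in bits 12-10 decodes to 1024-word pages with the automatic wait state enabled.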


6.7 BUS REQUEST/BUS GRANT 

The bus request (BR) and bus grant (BG) signals on the ADSP-21020 allow 
an external processor to gain control of the program memory and data 
memory buses in order to, for example, transfer data in or out of the 
ADSP-21020's external memory. 




BR is an input pin that requests access to the buses. When BR is asserted, 
the ADSP-21020 completes the current instruction and then places both 
data buses, both address buses, read and write strobes, PMPAGE, 
DMPAGE and all memory select pins in a high-impedance state. The 
ADSP-21020 then asserts BG to indicate to the requesting device that it is 
no longer driving its memory buses. The ADSP-21020 remains halted until 
BR is deasserted, signalling the release of the buses. The ADSP-21020 then 
continues from the instruction at which it halted. 

The bus grant operation is shown in Figure 6.4. Detailed timing 
requirements and characteristics are given in the ADSP-21020 Data Sheet. 

If BR is asserted and then deasserted (in the next cycle) before BG is 
asserted, the bus grant may or may not occur. For proper operation, BR 
should be held at least until BG goes low. 

Interrupts are sampled while the bus is granted, but remain pending until 
the bus request is released. When the processor continues program 
execution, pending interrupts are serviced in order of priority. 

While the ADSP-21020 is in reset, a bus request is recognized immediately 
because no instructions are being executed. The bus is granted just as in 
normal operation. The clock must be active for bus request to be 
recognized. 

Figure 6.4 Bus Request/Bus Grant Timing 


* The ADSP-21020 will complete execution of this instruction before granting its buses. 
Most instructions require only a single cycle to complete, but additional cycles may be 
needed for memory waitstates, DMACK/PMACK, multicycle instructions such as 
delayed branches (DB), etc. 

6.8 BUS EXCHANGE (PX REGISTERS) 

PX1 and PX2 are two registers used for transferring data between the 
48-bit PMD bus and 40-bit register file locations or the 40-bit DMD bus. 
PX1 is 16 bits wide, and PX2 is 32 bits wide. Either register can be read 
from or written to the PMD bus, the DMD bus or the register file. 

Data is aligned in PX register transfers as shown in Figure 6.5. When data 
is transferred between PX2 and the PMD bus, the upper 32 bits of the 
PMD bus are used. On transfers from PX2, the 16 LSBs of the PMD bus are 
filled with zeros. When data is transferred between PX1 and the PMD bus, 
the middle 16 bits of the PMD bus are used. On transfers from PX1, bits 
15-0 and bits 47-32 are filled with zeros. 


[Figure: PX register alignment for PMD transfers and for DMD or register file transfers] 

Figure 6.5 PX Register Transfers 

When data is transferred between PX2 and the DMD bus or the register 
file, the upper 32 bits of the DMD bus or the register file are used. On 
transfers from PX2, the eight LSBs are filled with zeros. When data is 
transferred between PX1 and the DMD bus or the register file, bits 23-8 of 
the DMD bus or the register file are used. On transfers from PX1, bits 7-0 
and bits 39-24 are filled with zeros. 

PX1 and PX2 can also be treated as a single PX register, but only for reads 
from and writes to program memory via the PMD bus. This allows the PX 
pair to contain the entire 48 bits coming from or going to program 
memory. PX2 contains the 32 MSBs of the 48-bit word while PX1 contains 
the 16 LSBs. (Program memory data is 40 bits wide and left-justified in the 
48-bit word.) 

To write a 48-bit word to the program memory location named Port1, for 
example, the following instructions would be used: 

R0=0x9A00;         /* load R0 with 16 LSBs */ 
R1=0x12345678;     /* load R1 with 32 MSBs */ 
PX1=R0; 
PX2=R1; 
PM(Port1)=PX;      /* write 16 LSBs to PM bits 15-0 */ 
                   /* and 32 MSBs to PM bits 47-16 */ 
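
The alignment rules of this section can be checked with a small Python model. This is a sketch only: the function names are hypothetical, and bus words are represented as plain integers with bit 0 as the LSB.

```python
def px_to_pmd(px2, px1):
    """Combined PX write: PX2 supplies PMD bits 47-16, PX1 supplies bits 15-0."""
    return ((px2 & 0xFFFFFFFF) << 16) | (px1 & 0xFFFF)


def px2_to_pmd(px2):
    """PX2 alone: upper 32 bits of the PMD bus, 16 LSBs filled with zeros."""
    return (px2 & 0xFFFFFFFF) << 16


def px1_to_pmd(px1):
    """PX1 alone: PMD bits 31-16, with bits 15-0 and 47-32 filled with zeros."""
    return (px1 & 0xFFFF) << 16


def px2_to_dmd(px2):
    """PX2 to the DMD bus or register file: bits 39-8, eight LSBs zero."""
    return (px2 & 0xFFFFFFFF) << 8


def px1_to_dmd(px1):
    """PX1 to the DMD bus or register file: bits 23-8, other bits zero."""
    return (px1 & 0xFFFF) << 8
```

With the values from the example, `px_to_pmd(0x12345678, 0x9A00)` gives the 48-bit word 0x123456789A00.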



Instruction Summary 


7.1 OVERVIEW 

This section describes the ADSP-21000 Family instruction set in brief. For 
more information, see Appendix A, Instruction Set Reference. 

The instructions are grouped into four categories: 

I. Compute and Move or Modify 

II. Program Flow Control 

III. Immediate Move 

IV. Miscellaneous 

The instructions are numbered; there are 22. Some instructions have more 
than one syntactical form; for example, Instruction 4 has four distinct 
forms. The instruction number has no bearing on programming, but 
corresponds to the opcode recognized by the ADSP-21020/21010 device. 

This section also contains several reference tables for using the instruction 
set. 


• Table 7.1 describes the notation and abbreviations used in this section. 

• Table 7.2 lists all condition and termination code mnemonics. 

• Table 7.3 lists all register mnemonics. 

• Tables 7.4 through 7.7 list the syntax for all compute operations 
(ALU, multiplier, shifter or multifunction). 

• Table 7.8 lists interrupts and their vectors. 

7.2 IMPORTANT PROGRAMMING REMINDERS 

This section summarizes information about the operation of the 
ADSP-21020/21010 that you should keep in mind when writing 
programs. Use it as a checklist for verifying that your program will 
execute as you intend. 

7.2.1 Extra Cycle Conditions 

All instructions can execute in a single clock period but may take longer in 
some cases. These cases are described in the following sections. 

7.2.1.1 Nondelayed Branches 

A nondelayed branch instruction (JUMP, CALL, RTS or RTI) fetches but 
does not execute the two instructions that follow it in program memory. 
Instead, these operations are aborted and the processor executes two 
NOPs. 

This two-cycle delay can be avoided by using delayed branches, which 
execute the two instructions following the branch instruction. The tradeoff 
is that the actual program flow does not match the apparent order of 
operations in the program; you must remember that the two extra 
instructions are executed before the branch is taken. 

7.2.1.2 Program Memory Data Access With Cache Miss 

The ADSP-21020 checks the instruction cache on every program memory 
data access. If the instruction needed is in the cache, the instruction fetch 
from the cache happens in parallel with the program memory data access 
and the instruction executes in a single cycle. However, if the instruction 
is not in the cache, the ADSP-21020 must wait for the program memory 
data access to complete before it can fetch the next instruction. This results 
in a minimum one-cycle delay, more if the program memory data access 
uses wait states. 

7.2.1.3 Program Memory Data Access In Loops 

The ADSP-21020 caches an instruction that it needs to fetch during the 
execution of a program memory data access. Because of the execution 
pipeline, this instruction is usually two memory locations after the 
program memory data access. If the program memory data access is in a 
loop, there will usually be a cache miss on the first iteration of the loop 
and cache hits on subsequent iterations, for a total of one extra cycle 
during the loop execution. 

However, there are certain cases in which different instructions are 
needed from the cache at different iterations. In these cases the number of 
cache misses, and therefore extra cycles, increases. These situations are 
summarized below. Note that this table is based on the worst-case 
scenario; the actual performance of the cache for a given program may be 
better. 

Location of Program            Cache     Loop Length 
Memory Data Access             Misses    (Instructions) 

Not at e or (e-1)              1         >2 
At e or (e-1)                  2         >2 
At the single loop location    3         1 

e = loop end address 

Two Misses: If the program memory data access occurs in the last two 
instructions of a loop, there will usually be cache misses on the first and 
the last loop iteration, for a total of two extra cycles. On the first iteration, 
the ADSP-21020 needs to fetch from the top of the loop (the first or second 
instruction). On the last iteration, the ADSP-21020 needs to fetch one of 
the two instructions following the loop. At each of these points there will 
be a cache miss the first time the code containing the loop is executed. 

Three Misses: If a loop contains only one instruction, and that instruction 
requires a program memory data access, there are potentially three cache 
misses. On the first iteration, the processor needs to fetch the loop 
instruction again (if the loop iterates three times or more). On the next-to- 
last iteration, the processor needs to fetch the instruction following the 
loop. On the last iteration, the processor needs to fetch the second 
instruction following the loop. In each case, there will be a cache miss the 
first time the code containing the loop is executed. 

7.2.1.4 One- And Two-Instruction Loops 

Counter-based loops that have only one or two instructions can cause 
delays if not executed a minimum number of times. The ADSP-21020 
checks the termination condition two cycles before it exits the loop. In 
these short loops, the ADSP-21020 has already looped back when the 
termination condition is tested. Thus, if the termination condition tests 
true, the two instructions in the pipeline must be aborted and NOPs 
executed instead. 

Specifically, a loop of length one executed one or two times or a loop of 
length two executed only once incurs two cycles of overhead because 
there are two aborted instructions after the last iteration. Note that these 
overhead cycles are in addition to any extra cycles caused by a program 
memory data access inside the loop (see previous section). To avoid 
overhead, use straight-line code instead of loops in these cases. 

7.2.1.5 DAG And Memory Control Register Writes 

When an instruction that loads a DAG register is followed by an 
instruction that uses any register in the same DAG for data addressing, 
the ADSP-21020 inserts an extra (NOP) cycle between the two 
instructions. This happens because the same bus is needed by both 
operations in the same cycle, therefore the second operation must be 
delayed. An example is: 

L2=8; 
DM(I0,M1)=R1; 

For the same reason, the ADSP-21020 also inserts an extra cycle after an 
instruction that writes a memory control register if it is followed by an 
instruction that uses a register in the corresponding DAG (DAG1 for data 
memory control registers, DAG2 for program memory control registers). 
Data memory control registers are DMWAIT, DMBANK1-3 and DMADR. 
Program memory control registers are PMWAIT, PMBANK1 and 
PMADR. (Note that because the DAG2 registers are used to fetch 
instructions or access data in every cycle, a write to a program memory 
control register will always require an extra cycle to be inserted.) 

Each of the following instruction sequences, for example, causes the 
ADSP-21020 to insert an extra cycle between the two instructions: 

PMWAIT=0x080000;      or      DMBANK1=0x10000000; 
NOP;                          R15=DM(I0,M1); 

An instruction that writes any L or M register of DAG2 (L8-L15, M8-M15), 
immediately followed by an instruction that reads the corresponding 
I register will result in incorrect data being read from the I register. The 
following instruction sequence, for example, will cause incorrect data to 
be read from I8: 

L8=24; 
R0=I8; 

To prevent this, add a NOP between the two instructions: 

L8=24; 
NOP; 
R0=I8; 

7.2.1.6 Wait States 

A memory access can be programmed to include a specific number of wait 
states and/or to wait for an external acknowledge signal before 
completing. If only internally programmed wait states are used, the delay 
is the number of wait states (1 wait state = 1 cycle). If the external 
acknowledge is used, either alone or in combination with programmed 
wait states, the delay depends on the external system and can vary. 

7.2.1.7 Page Boundary Crossing 

The page detection logic in the ADSP-21020 can be configured to add a 
wait state to any memory access that crosses a page boundary. This 
feature facilitates the external memory page control, which may require 
extra time on a change of page. If it does not, or if paging is not 
implemented, the extra wait state does not need to be configured. 

7.2.1.8 Three-State Enables 

Both the program memory port and the data memory port include a three- 
state enable input that an external device can assert to hold the ADSP- 
21020 off the particular memory bus. See Chapter 6 for complete 
information on these controls. The ADSP-21020 continues to execute 
instructions while the three-state enable is active until it requires access to 
the memory bus. At that point, the ADSP-21020 must wait, executing 
NOPs, until the three-state enable is deasserted. The delay depends on the 
external system and can vary. 

If the ADSP-21020 is accessing the memory when the corresponding three- 
state enable is asserted, it holds off completion of the memory cycle until 
the three-state enable is deasserted. DMACK or PMACK must be used to 
insert an extra cycle to complete the memory access. 

7.2.1.9 Bus Request/Bus Grant 

As with the three-state enables, an external device can assert the 
ADSP-21020 bus request (BR) to gain control of the memory buses (in this case, 
both buses at once). The ADSP-21020 responds to a bus request by 
completing the current instruction, placing both memory ports in a high- 
impedance state, and asserting its bus grant (BG) output. It executes NOPs 
until the bus request is deasserted. As with the three-state enables, the 
delay depends on the external system and can vary. 

7.2.2 Delayed Branch Restrictions 

A delayed branch instruction and the two instructions that follow it in 
program memory must be executed sequentially. Any interrupt that 
occurs in between a delayed branch instruction and either of the two 
instructions that follow is not processed until the branch is complete. 

Instructions in the two program memory locations immediately following 
a delayed branch instruction cannot be any of the following: 

• Other Jumps, Calls or Returns 

• Pushes or Pops of the PC stack 

• Writes to the PC stack or PC stack pointer 

• DO UNTIL instruction 

• IDLE instruction 

These exceptions are checked by the assembler. 

7.2.3 Loop Restrictions 

If any of the final three instructions of a loop is a jump without loop 
abort, a call or a return, the loop may not be executed correctly. If an 
interrupt occurs during the execution of the last three instructions of a 
loop, its processing is delayed until after the last instruction is executed. 

The third-to-last instruction of a counter-based loop cannot be a write to 
CURLCNTR from external memory. 

A non-counter-based loop three instructions long completes one full 
iteration after the termination condition becomes true. If the loop has two 
instructions, one or two full iterations occur after the condition becomes 
true. If the loop has only one instruction, three more passes are executed 
after the termination condition becomes true. 

For no overhead, a counter-based loop of length one must be executed at 
least three times and a counter-based loop of length two must be executed 
at least twice. Loops of length one that iterate only once or twice and loops 
of length two that iterate only once incur two cycles of overhead because 
there are two aborted instructions after the last iteration. 
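
The overhead rule for short counter-based loops can be written as a small Python check. The function name is hypothetical, and the sketch covers only the two aborted instructions described here, not any extra cycles from program memory data accesses.

```python
def short_loop_overhead(length, iterations):
    """Return the overhead cycles (0 or 2) of a counter-based loop.

    length     -- number of instructions in the loop body
    iterations -- number of times the loop is executed
    """
    if length < 1 or iterations < 1:
        raise ValueError("length and iterations must be at least 1")
    if length == 1 and iterations < 3:
        return 2      # one-instruction loop executed once or twice
    if length == 2 and iterations < 2:
        return 2      # two-instruction loop executed only once
    return 0          # no aborted instructions after the last iteration
```

A result of 2 suggests replacing the loop with straight-line code.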

Nested loops cannot terminate on the same instruction. For nested loops 
in which the outer loop's termination condition is not LCE, the end 
address of the outer loop must be at least two locations after the end 
address of the inner loop. 

7.2.4 Interrupts 

ADSP-21020 operations that span more than one cycle are not allowed to 
be interrupted. If an interrupt occurs during one of these operations, it is 
synchronized and latched, but its recognition is delayed: 

• a branch (call, jump or return) and the following cycle, whether it is an 
instruction (in a delayed branch) or no-operation (in a non-delayed 
branch) 

• the first of the two cycles needed to perform a program memory data 
access and an instruction fetch (when there is an instruction cache miss). 

• the third-to-last iteration of a one-instruction loop 

• the last iteration of a one-instruction loop executed twice and the 
following cycle (which is a no-operation) 

• the last iteration of a two-instruction loop executed only once and the 
following cycle (which is a no-operation) 

• the first of the two cycles needed to fetch and decode the first 
instruction of an interrupt routine 

For most interrupts, internal and external, only one instruction is executed 
after the interrupt occurs and before the two instructions that are aborted 
while the processor fetches and decodes the first service routine instruction. 
Because of the one-cycle delay between an arithmetic exception and the 
STKY register update, however, there are two cycles after an arithmetic 
exception occurs before interrupt processing starts. 

7.2.5 IRPTL 

IRPTL is in an indeterminate state at reset. You should clear IRPTL by 
writing zeros to it before enabling interrupts or unmasking any interrupt. 

7.2.6 Effect Latency And Read Latency 

Writes to some registers require an extra cycle before taking effect. This 
delay is called effect latency. Some registers require an extra cycle after a 
write before a read of the register yields the new value. This delay is called 
read latency. Effect latency and read latency for registers are listed below: 


Register      Read       Effect 
Name          Latency    Latency 

PCSTK         0          0 
PCSTKP        1          1 
LADDR         0          0 
CURLCNTR      0          0 
LCNTR         0          0 
MODE1         0          1 
MODE2         0          1 
IRPTL         0          0 
IMASK         0          1 
IMASKP        1          1 
ASTAT         0          1 
STKY          0          1 
USTAT1        0          0 
USTAT2        0          0 
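
One way to use the latency figures is as a lookup when reviewing code. The Python sketch below flags a read that immediately follows a write to a register with nonzero read latency. The representation of a program as a list of (operation, register) pairs and the function name are assumptions made for illustration; this is not a feature of the assembler.

```python
# (read latency, effect latency) per register, from the table in this section.
LATENCY = {
    "PCSTK": (0, 0), "PCSTKP": (1, 1), "LADDR": (0, 0), "CURLCNTR": (0, 0),
    "LCNTR": (0, 0), "MODE1": (0, 1), "MODE2": (0, 1), "IRPTL": (0, 0),
    "IMASK": (0, 1), "IMASKP": (1, 1), "ASTAT": (0, 1), "STKY": (0, 1),
    "USTAT1": (0, 0), "USTAT2": (0, 0),
}


def stale_reads(ops):
    """Return the indices of reads that would still see the old register value.

    ops is a list of ("write" | "read" | "nop", register_name_or_None) pairs,
    one per instruction, in program order.
    """
    stale = []
    for i in range(1, len(ops)):
        kind, reg = ops[i]
        prev_kind, prev_reg = ops[i - 1]
        if kind == "read" and prev_kind == "write" and prev_reg == reg:
            if LATENCY[reg][0] > 0:       # read latency of one cycle
                stale.append(i)
    return stale
```

Under these assumptions, a read of PCSTKP directly after a write to it is flagged, and separating the two with one other instruction clears the warning.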




7.2.7 CURLCNTR Write & LCE 

If an LCE instruction follows a write to CURLCNTR, the condition tested 
will be based on the old CURLCNTR value. This is because the write of 
CURLCNTR has a delay of one cycle. 

7.2.8 Circular Buffer Initialization 

You set up a circular buffer by initializing an L register with a positive, 
nonzero value and loading the corresponding (same-numbered) B register 
with the base (lowest) address of the buffer. The corresponding I register 
is automatically loaded with this same starting address. 
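
The initialization rule can be illustrated with a minimal Python model of one B/L/I register set. Only the rule that loading B also loads I comes from this section; the wrap-around arithmetic in `post_modify` is an assumed, simplified model of circular addressing that requires the modify value to be no larger in magnitude than the buffer length, and the class and method names are hypothetical.

```python
class CircularBuffer:
    """Simplified model of one DAG circular buffer (B, L and I registers)."""

    def __init__(self, base, length):
        if length <= 0:
            raise ValueError("L must be a positive, nonzero value")
        self.l = length
        self.b = base
        self.i = base          # loading B automatically loads I with the same address

    def post_modify(self, m):
        """Return the current address, then advance I by m with wrap-around."""
        if abs(m) > self.l:
            raise ValueError("model assumes |M| <= L")
        address = self.i
        nxt = self.i + m
        if nxt >= self.b + self.l:
            nxt -= self.l      # wrapped past the end of the buffer
        elif nxt < self.b:
            nxt += self.l      # wrapped below the base address
        self.i = nxt
        return address
```

Stepping a four-word buffer at base 0x100 by one word at a time visits 0x100 through 0x103 and then returns to 0x100.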

7.2.9 Bit-Reverse Mode And Data Memory Bank Select 

Due to timing constraints, addresses output in bit-reverse mode always 
activate DMS0 (Data Memory Select 0) and the number of wait states 
associated with it, regardless of the actual address value. In most systems, 
this means that a bit-reversed address must be within the lowest bank of 
data memory space. 
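
The consequence can be seen in a short Python sketch. It assumes that the full 32-bit address is bit-reversed and uses the reset values of DMBANK1-3 as bank boundaries; the function names are hypothetical.

```python
def bit_reverse32(address):
    """Reverse the bit order of a 32-bit address (assumed reversal width)."""
    return int(f"{address & 0xFFFFFFFF:032b}"[::-1], 2)


def bank_by_address(address, boundaries=(0x20000000, 0x40000000, 0x80000000)):
    """Bank that a normal (non-bit-reversed) access to this address selects."""
    return sum(address >= boundary for boundary in boundaries)


def bank_selected(address, bit_reversed):
    """Bank select actually asserted: always bank 0 in bit-reverse mode."""
    return 0 if bit_reversed else bank_by_address(address)
```

Under these assumptions, an index value of 1 is output as 0x80000000, which lies in bank 3 at reset, yet the bank 0 select and wait states are used. Memory decoded only by the select pins would therefore not respond in the intended bank.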

7.2.10 Disallowed DAG Register Transfers 

The following instructions execute on the ADSP-21020, but cause incorrect 
results. These instructions are disallowed by the assembler: 

• An instruction that stores a DAG register in memory using indirect 
addressing from the same DAG, with or without update of the index 
register. The instruction writes the wrong data to memory or updates 
the wrong index register. 

DM(M2,I1)=I0;   or   DM(I1,M2)=I0; 

• An instruction that loads a DAG register from memory using indirect 
addressing from the same DAG, with update of the index register. The 
instruction will either load the DAG register or update the index 
register, but not both. 

L2=DM(I1,M0); 

7.2.11 Two Writes To Register File 

If two writes to the same register file location take place in the same cycle, 
only the write with higher precedence actually occurs. Precedence is 
determined by the source of the data being written; from highest to 
lowest, the precedence is: 

• Data memory or universal register 

• Program memory 

• ALU 

• Multiplier 

• Shifter 
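
The precedence list can be expressed as an ordered lookup. In this Python sketch the source keys and the function name are hypothetical labels for the five sources listed above.

```python
# Sources of a register file write, from highest to lowest precedence.
PRECEDENCE = ["dm_or_ureg", "pm", "alu", "multiplier", "shifter"]


def winning_write(writes):
    """Return (source, value) of the write that actually occurs.

    writes maps each source that targets the same register file location
    in the same cycle to the value it tries to write.
    """
    for source in PRECEDENCE:
        if source in writes:
            return source, writes[source]
    raise ValueError("no recognized write source")
```

For example, if the ALU and a program memory read both target the same location, only the program memory data is written.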

7.2.12 Stack Status Flags 

The stack overflow/full and underflow flags in the STKY register are not 
"sticky." Writes to the STKY register have no effect on these bits. 

7.2.13 Wait States And Three-State Enables 

DMTS and PMTS should not be used in conjunction with internally 
programmed wait states alone. The ADSP-21020 counts wait states 
whether or not the memory outputs are enabled, so three-stating during a 
memory access could cause the access to have too few wait states. You 
should use memory acknowledge (DMACK or PMACK) to implement 
wait states, if needed. The acknowledge can be conditioned by the three- 
state enable so that the number of wait states depends on how long DMTS 
or PMTS is asserted. DMACK and PMACK are not sampled when the 
corresponding three-state enable is active. 

7.2.14 Computation Units 

In fixed-point to floating-point conversion, the rounding boundary is 
always 40 bits even if the RND32 bit is set. 

The ALU Zero flag (AZ) signifies floating-point underflow as well as a 
zero result. 

Transfers between MR registers and the register file are considered 
multiplier operations. 



Notation                  Meaning 

UPPERCASE                 explicit syntax; assembler keyword 
;                         instruction terminator 
,                         separates parallel operations in an instruction 
italics                   optional part of instruction 
| between lines |         list of options (choose one) 
<datan>                   n-bit immediate data value 
<addrn>                   n-bit immediate address value 
<reladdrn>                n-bit immediate PC-relative address value 
<bit6>:<len6>             6-bit immediate bit position and length values 
                          (for shifter immediate operations) 

compute                   ALU, multiplier, shifter or multifunction operation 
                          (from Tables 7.4-7.7) 
shiftimm                  shifter immediate operation (from Table 7.6) 
condition                 status condition (from Table 7.2) 
termination               termination condition (from Table 7.2) 
ureg                      universal register (from Table 7.3) 
sreg                      system register (from Table 7.3) 
dreg                      R15-R0, F15-F0; register file location 
Rn, Rx, Ry, Ra, Rm, Rs    R15-R0; register file location, fixed-point 
Fn, Fx, Fy, Fa, Fm, Fs    F15-F0; register file location, floating-point 
R3-0                      R3, R2, R1, R0 
R7-4                      R7, R6, R5, R4 
R11-8                     R11, R10, R9, R8 
R15-12                    R15, R14, R13, R12 
F3-0                      F3, F2, F1, F0 
F7-4                      F7, F6, F5, F4 
F11-8                     F11, F10, F9, F8 
F15-12                    F15, F14, F13, F12 

Ia                        I7-I0; DAG1 index register 
Mb                        M7-M0; DAG1 modify register 
Ic                        I15-I8; DAG2 index register 
Md                        M15-M8; DAG2 modify register 

(DB)                      Delayed branch 
(LA)                      Loop abort (pop loop, PC stacks on branch) 

MR0F                      Multiplier result accumulator 0, foreground 
MR1F                      Multiplier result accumulator 1, foreground 
MR2F                      Multiplier result accumulator 2, foreground 
MR0B                      Multiplier result accumulator 0, background 
MR1B                      Multiplier result accumulator 1, background 
MR2B                      Multiplier result accumulator 2, background 

Table 7.1 Syntax Notation Conventions 

Name              Description 

EQ                ALU equal zero 
NE                ALU not equal to zero 
GE                ALU greater than or equal zero 
LT                ALU less than zero 
LE                ALU less than or equal zero 
GT                ALU greater than zero 
AC                ALU carry 
NOT AC            Not ALU carry 
AV                ALU overflow 
NOT AV            Not ALU overflow 
MV                Multiplier overflow 
NOT MV            Not multiplier overflow 
MS                Multiplier sign 
NOT MS            Not multiplier sign 
SV                Shifter overflow 
NOT SV            Not shifter overflow 
SZ                Shifter zero 
NOT SZ            Not shifter zero 
FLAG0_IN          Flag 0 
NOT FLAG0_IN      Not Flag 0 
FLAG1_IN          Flag 1 
NOT FLAG1_IN      Not Flag 1 
FLAG2_IN          Flag 2 
NOT FLAG2_IN      Not Flag 2 
FLAG3_IN          Flag 3 
NOT FLAG3_IN      Not Flag 3 
TF                Bit test flag 
NOT TF            Not bit test flag 
LCE               Loop counter expired (DO UNTIL) 
NOT LCE           Loop counter not expired (IF) 
FOREVER           Always False (DO UNTIL) 
TRUE              Always True (IF) 

In a conditional instruction, the execution of the entire instruction is based on the 
specified condition. 

Table 7.2 Condition and Termination Codes 
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Mnemonic Contents 


PC*          program counter
PCSTK        top of PC stack
PCSTKP       PC stack pointer
FADDR*       fetch address
DADDR*       decode address
LADDR        top of loop address stack
CURLCNTR     top of loop count stack
LCNTR        loop count for next loop
R15-R0       register file locations (fixed-point data)
F15-F0       register file locations (floating-point data)
I15-I8       DAG2 index registers
I7-I0        DAG1 index registers
M15-M8       DAG2 modify registers
M7-M0        DAG1 modify registers
L15-L8       DAG2 length registers
L7-L0        DAG1 length registers
B15-B8       DAG2 base registers
B7-B0        DAG1 base registers
DMWAIT       wait state and page size control for data memory
DMBANK1      data memory bank 1 lower boundary
DMBANK2      data memory bank 2 lower boundary
DMBANK3      data memory bank 3 lower boundary
DMADR        copy of last data memory address
PMWAIT       wait state and page size control for program memory
PMBANK1      program memory bank 1 lower boundary
PMADR        copy of last program memory address
PX           48-bit PX1 and PX2 combination
PX1          bus exchange 1 (16 bits)
PX2          bus exchange 2 (32 bits)
TPERIOD      timer period
TCOUNT       timer counter

System Registers (these are also Universal Registers):

MODE1        mode control 1
MODE2        mode control 2
IRPTL        interrupt latch
IMASK        interrupt mask
IMASKP       interrupt mask pointer
ASTAT        arithmetic status
STKY         sticky status
USTAT1       user status reg 1
USTAT2       user status reg 2

* read-only


Table 7.3 Universal Registers and System Registers 
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The system register bit manipulation instruction can be used to set, clear, 
toggle or test specific bits in the system registers. This instruction is 
described in Appendix A, Group IV-Miscellaneous instructions. 

Examples: 

BIT SET MODE2 0x00000070; 

BIT TST ASTAT 0x00002000; 
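
The effect of the bit manipulation instruction on a 32-bit system register can be sketched as a small Python model. This is an illustration only, not the processor's definition: the function name `bit_op` is hypothetical, and the flag behaviour assumed here (TST reports whether all selected bits are set, XOR reports whether the register equals the data value) should be checked against Appendix A.

```python
MASK32 = 0xFFFFFFFF  # system registers are modelled as 32-bit values


def bit_op(op, sreg, data):
    """Model of BIT <op> sreg <data32>.

    Returns (new_sreg, flag). flag is None for operations that only
    modify the register; for TST and XOR it is the assumed bit test result.
    """
    sreg &= MASK32
    data &= MASK32
    if op == "SET":
        return sreg | data, None            # set the selected bits
    if op == "CLR":
        return sreg & ~data & MASK32, None  # clear the selected bits
    if op == "TGL":
        return sreg ^ data, None            # toggle the selected bits
    if op == "TST":
        return sreg, (sreg & data) == data  # assumed: all selected bits set
    if op == "XOR":
        return sreg, sreg == data           # assumed: register equals data
    raise ValueError("unknown BIT operation: " + op)


# Mirrors the two examples above, starting from a cleared register.
mode2, _ = bit_op("SET", 0x00000000, 0x00000070)
_, astat_bit_set = bit_op("TST", 0x00002000, 0x00002000)
```

The register is never modified by TST or XOR in this model; only the returned flag changes.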


Fixed-point 

Rn = Rx + Ry 
Rn = Rx - Ry 

Rn = Rx + Ry, Rm = Rx - Ry 
Rn = Rx + Ry + CI 
Rn = Rx - Ry + CI - 1 
Rn = (Rx + Ry)/2 
COMP(Rx, Ry) 

Rn = -Rx 
Rn = ABS Rx 
Rn = PASS Rx 
Rn = MIN(Rx, Ry) 

Rn = MAX(Rx, Ry) 

Rn = CLIP Rx BY Ry 
Rn = Rx + CI 
Rn = Rx + CI - 1 
Rn = Rx + 1 
Rn = Rx - 1 
Rn = Rx AND Ry 
Rn = Rx OR Ry 
Rn = Rx XOR Ry 
Rn = NOT Rx 


Floating-point 

Fn = Fx + Fy 
Fn = Fx - Fy 

Fn = Fx + Fy, Fm = Fx - Fy 
Fn = ABS (Fx + Fy) 

Fn = ABS (Fx - Fy) 

Fn = (Fx + Fy)/2 
COMP(Fx, Fy) 

Fn = -Fx 
Fn = ABS Fx 
Fn = PASS Fx 
Fn = MIN(Fx, Fy) 

Fn = MAX(Fx, Fy) 

Fn = CLIP Fx BY Fy 
Fn = RND Fx 
Fn = SCALE Fx BY Ry 
Rn = MANT Fx 
Rn = LOGB Fx 
Rn = FIX Fx BY Ry 
Rn = FIX Fx 
Fn = FLOAT Rx BY Ry 
Fn = FLOAT Rx 
Fn = RECIPS Fx 
Fn = RSQRTS Fx 
Fn = Fx COPYSIGN Fy 


Table 7.4 ALU Instructions 


| Rn  | = Rx * Ry  ( | S |  | S |  | F  | )
| MRF |              | U |  | U |  | I  |
| MRB |                            | FR |

| Rn = MRF  | + Rx * Ry  ( | S |  | S |  | F  | )
| Rn = MRB  |              | U |  | U |  | I  |
| MRF = MRF |                            | FR |
| MRB = MRB |

| Rn = MRF  | - Rx * Ry  ( | S |  | S |  | F  | )
| Rn = MRB  |              | U |  | U |  | I  |
| MRF = MRF |                            | FR |
| MRB = MRB |

| Rn = SAT MRF  |  | (SI) |
| Rn = SAT MRB  |  | (UI) |
| MRF = SAT MRF |  | (SF) |
| MRB = SAT MRB |  | (UF) |

| Rn = RND MRF  |  | (SF) |
| Rn = RND MRB  |  | (UF) |
| MRF = RND MRF |
| MRB = RND MRB |

| MRF | = 0
| MRB |

| MRxF | = Rn
| MRxB |

Rn = | MRxF |
     | MRxB |

Fn = Fx * Fy


( X-input  Y-input  format )

S     Signed input
U     Unsigned input
I     Integer input(s)
F     Fractional input(s)
FR    Fractional inputs, rounded output


Rn, Rx, Ry    R15-R0; register file location, fixed-point
Fn, Fx, Fy    F15-F0; register file location, floating-point
MRxF          MR2F, MR1F, MR0F; multiplier result accumulators, foreground
MRxB          MR2B, MR1B, MR0B; multiplier result accumulators, background
(SF)          Default format for 1-input operations
(SSF)         Default format for 2-input operations


Table 7.5 Multiplier Instructions 
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Shifter 

Rn = LSHIFT Rx BY Ry 

Rn = Rn OR LSHIFT Rx BY Ry 

Rn = ASHIFT Rx BY Ry 

Rn = Rn OR ASHIFT Rx BY Ry 

Rn = ROT Rx BY Ry 

Rn = BCLR Rx BY Ry 

Rn = BSET Rx BY Ry 

Rn = BTGL Rx BY Ry 

BTST Rx BY Ry 

Rn = FDEP Rx BY Ry 

Rn = Rn OR FDEP Rx BY Ry 

Rn = FDEP Rx BY Ry (SE) 

Rn = Rn OR FDEP Rx BY Ry (SE) 
Rn = FEXT Rx BY Ry 
Rn = FEXT Rx BY Ry (SE) 

Rn = EXP Rx 
Rn = EXP Rx (EX) 

Rn = LEFTZ Rx 
Rn = LEFTO Rx 


Shifter Immediate 

Rn = LSHIFT Rx BY <data8> 

Rn = Rn OR LSHIFT Rx BY <data8> 

Rn = ASHIFT Rx BY <data8> 

Rn = Rn OR ASHIFT Rx BY <data8> 

Rn = ROT Rx BY <data8> 

Rn = BCLR Rx BY <data8> 

Rn = BSET Rx BY <data8> 

Rn = BTGL Rx BY <data8> 

BTST Rx BY <data8> 

Rn = FDEP Rx BY <bit6>:<len6> 

Rn = Rn OR FDEP Rx BY <bit6>:<len6> 

Rn = FDEP Rx BY <bit6>:<len6> (SE) 

Rn = Rn OR FDEP Rx BY <bit6>:<len6> (SE) 
Rn = FEXT Rx BY <bit6>:<len6> 

Rn = FEXT Rx BY <bit6>:<len6> (SE) 


Table 7.6 Shifter and Shifter Immediate Instructions 







Fixed-point

Rm=R3-0 * R7-4 (SSFR),           Ra=R11-8 + R15-12
Rm=R3-0 * R7-4 (SSFR),           Ra=R11-8 - R15-12
Rm=R3-0 * R7-4 (SSFR),           Ra=(R11-8 + R15-12)/2
MRF=MRF + R3-0 * R7-4 (SSF),     Ra=R11-8 + R15-12
MRF=MRF + R3-0 * R7-4 (SSF),     Ra=R11-8 - R15-12
MRF=MRF + R3-0 * R7-4 (SSF),     Ra=(R11-8 + R15-12)/2
Rm=MRF + R3-0 * R7-4 (SSFR),     Ra=R11-8 + R15-12
Rm=MRF + R3-0 * R7-4 (SSFR),     Ra=R11-8 - R15-12
Rm=MRF + R3-0 * R7-4 (SSFR),     Ra=(R11-8 + R15-12)/2
MRF=MRF - R3-0 * R7-4 (SSF),     Ra=R11-8 + R15-12
MRF=MRF - R3-0 * R7-4 (SSF),     Ra=R11-8 - R15-12
MRF=MRF - R3-0 * R7-4 (SSF),     Ra=(R11-8 + R15-12)/2
Rm=MRF - R3-0 * R7-4 (SSFR),     Ra=R11-8 + R15-12
Rm=MRF - R3-0 * R7-4 (SSFR),     Ra=R11-8 - R15-12
Rm=MRF - R3-0 * R7-4 (SSFR),     Ra=(R11-8 + R15-12)/2
Rm=R3-0 * R7-4 (SSFR),           Ra=R11-8 + R15-12, Rs=R11-8 - R15-12


Floating-point

Fm=F3-0 * F7-4,     Fa=F11-8 + F15-12
Fm=F3-0 * F7-4,     Fa=F11-8 - F15-12
Fm=F3-0 * F7-4,     Fa=FLOAT R11-8 by R15-12
Fm=F3-0 * F7-4,     Fa=FIX R11-8 by R15-12
Fm=F3-0 * F7-4,     Fa=(F11-8 + F15-12)/2
Fm=F3-0 * F7-4,     Fa=ABS F11-8
Fm=F3-0 * F7-4,     Fa=MAX (F11-8, F15-12)
Fm=F3-0 * F7-4,     Fa=MIN (F11-8, F15-12)
Fm=F3-0 * F7-4,     Fa=F11-8 + F15-12, Fs=F11-8 - F15-12


Ra, Rm     Any register file location (fixed-point)
R3-0       R3, R2, R1, R0
R7-4       R7, R6, R5, R4
R11-8      R11, R10, R9, R8
R15-12     R15, R14, R13, R12
Fa, Fm     Any register file location (floating-point)
F3-0       F3, F2, F1, F0
F7-4       F7, F6, F5, F4
F11-8      F11, F10, F9, F8
F15-12     F15, F14, F13, F12
(SSF)      X-input signed, Y-input signed, fractional inputs
(SSFR)     X-input signed, Y-input signed, fractional inputs, rounded output

Table 7.7 Multifunction Instructions 
No.      Vector       Function

0        0x00         Reserved
1*       0x08         Reset
2        0x10         Reserved
3        0x18         Status stack or loop stack overflow or PC stack full
4        0x20         Timer=0 (high priority option)
5        0x28         IRQ3 asserted
6        0x30         IRQ2 asserted
7        0x38         IRQ1 asserted
8        0x40         IRQ0 asserted
9        0x48         Reserved
10       0x50         Reserved
11       0x58         Circular buffer 7 overflow
12       0x60         Circular buffer 15 overflow
13       0x68         Reserved
14       0x70         Timer=0 (low priority option)
15       0x78         Fixed-point overflow
16       0x80         Floating-point overflow
17       0x88         Floating-point underflow
18       0x90         Floating-point invalid operation
19-23    0x98-0xB8    Reserved
24-31    0xC0-0xF8    User software interrupts

* Nonmaskable


Table 7.8 Interrupt Vectors and Priority 
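
The vector column of Table 7.8 follows a simple rule: each interrupt number maps to an address eight locations after the previous one. A minimal Python sketch of that relationship (the helper name `vector_address` is hypothetical, used only for illustration):

```python
VECTOR_SPACING = 8   # instruction locations per interrupt vector
NUM_VECTORS = 32     # interrupt numbers 0 through 31


def vector_address(number):
    """Return the program memory address of interrupt vector `number`."""
    if not 0 <= number < NUM_VECTORS:
        raise ValueError("interrupt number must be in the range 0..31")
    return number * VECTOR_SPACING


# Print the whole table of vector addresses in the style of Table 7.8.
for n in range(NUM_VECTORS):
    print(f"{n:2d}  0x{vector_address(n):02X}")
```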
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7 Instruction Summary 



I. Compute and Move or Modify

1.     compute ,  | DM(Ia, Mb) = dreg1 | ,  | PM(Ic, Md) = dreg2 | ;
                  | dreg1 = DM(Ia, Mb) |    | dreg2 = PM(Ic, Md) |

2.     IF condition  compute ;

3. a.  IF condition  compute ,  | DM(Ia, Mb) |  = ureg ;
                                | PM(Ic, Md) |

   b.  IF condition  compute ,  | DM(Mb, Ia) |  = ureg ;
                                | PM(Md, Ic) |

   c.  IF condition  compute ,  ureg =  | DM(Ia, Mb) | ;
                                        | PM(Ic, Md) |

   d.  IF condition  compute ,  ureg =  | DM(Mb, Ia) | ;
                                        | PM(Md, Ic) |

4. a.  IF condition  compute ,  | DM(Ia, <data6>) |  = dreg ;
                                | PM(Ic, <data6>) |

   b.  IF condition  compute ,  | DM(<data6>, Ia) |  = dreg ;
                                | PM(<data6>, Ic) |

   c.  IF condition  compute ,  dreg =  | DM(Ia, <data6>) | ;
                                        | PM(Ic, <data6>) |

   d.  IF condition  compute ,  dreg =  | DM(<data6>, Ia) | ;
                                        | PM(<data6>, Ic) |

5.     IF condition  compute ,  ureg1 = ureg2 ;

6. a.  IF condition  shiftimm ,  | DM(Ia, Mb) |  = dreg ;
                                 | PM(Ic, Md) |

   b.  IF condition  shiftimm ,  dreg =  | DM(Ia, Mb) | ;
                                         | PM(Ic, Md) |

7.     IF condition  compute ,  MODIFY  | (Ia, Mb) | ;
                                        | (Ic, Md) |
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II. Program Flow Control 


8.     IF condition  | JUMP |  | <addr24>          |  ( | DB     | ) ;
                     | CALL |  | (PC, <reladdr24>) |    | LA     |
                                                        | DB, LA |

9.     IF condition  | JUMP |  | (Md, Ic)         |  ( | DB     | ) , compute ;
                     | CALL |  | (PC, <reladdr6>) |    | LA     |
                                                       | DB, LA |

11.    IF condition  | RTS |  ( | DB     | ) , compute ;
                     | RTI |    | LA     |
                                | DB, LA |

12.    LCNTR =  | <data16> | , DO  | <addr24>          |  UNTIL LCE ;
                | ureg     |       | (PC, <reladdr24>) |

13.    DO  | <addr24>          |  UNTIL termination ;
           | (PC, <reladdr24>) |
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Immediate Move 


14. a.  | DM(<addr32>) |  = ureg ;
        | PM(<addr24>) |

    b.  ureg =  | DM(<addr32>) | ;
                | PM(<addr24>) |

15. a.  | DM(<data32>, Ia) |  = ureg ;
        | PM(<data24>, Ic) |

    b.  ureg =  | DM(<data32>, Ia) | ;
                | PM(<data24>, Ic) |

16.     | DM(Ia, Mb) |  = <data32> ;
        | PM(Ic, Md) |

17.     ureg = <data32> ;
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IV. Miscellaneous 


18.     BIT  | SET |  sreg <data32> ;
             | CLR |
             | TGL |
             | TST |
             | XOR |

19. a.  MODIFY  | (Ia, <data32>) | ;
                | (Ic, <data32>) |

    b.  BITREV (Ia, <data32>) ;

20.     | PUSH |  LOOP ,  | PUSH |  STS ;
        | POP  |          | POP  |

21.     NOP ;

22.     IDLE ;
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Assembly Programming 

Tutorial 


8.1 INTRODUCTION 

This tutorial is for first-time ADSP-21020/21010 assembly-language 
programmers. It explains basic techniques and conventions for good 
programming. The approach described here can be applied to 
ADSP-21020/21010 programming in general. 

The tutorial makes use of the ADSP-21000 Family Development Software 
(Assembler, Linker and Simulator programs). You must have this 
software to complete the parts of the tutorial that create the executable 
programs. 

This tutorial demonstrates ADSP-21020/21010 programming by example. 
It presents two completely working DSP systems based on the 
ADSP-21020, highlighting: 

• Architecture file 

• Assembler preprocessor directives 

• Calling subroutines in another file from the main program 

• General processor initialization procedure 

• Interrupt vector table placement and usage 

• Programming memory wait states and bank selects 

• Looping code, including: 

-Rolling a loop for more compact code 

• Multifunction instructions, focusing on: 

-Usage and restrictions 

• Simulating I/O ports 

• Using the interval timer 

• How to use interrupts, including: 

-Vector table code 
-Which registers are used 
-What to do after processor reset 
-Fast context switching 

• Special features of the system registers 

• Useable code examples 

• Two turnkey, complete system examples 

• An additional, efficient FFT code example 



Both examples filter digital data using infinite impulse response (IIR) 
filters. The examples consist of a main calling shell and called subroutines. 
Technical information on the filters themselves is presented after the 
examples. 

The source files for the examples presented in this chapter can be obtained 
by: 


• Downloading the files from the DSP Applications Bulletin Board 
Service (see © page at the front of this manual for BBS contact 
information), 

• Purchasing the ADSP-21000 Family Development Software, which 
includes the example files, or 

• Contacting Analog Devices DSP Applications Engineering directly (see 
© page at the front of this manual for phone /FAX numbers). 

The first example, called "iirmem", (see Figure 8.1) reads a buffer of input 
values stored in memory, passes the data through the filter, and writes the 
results to another buffer in memory. The second example, called "iirirq", 
(see Figure 8.2, on page 8-4) reads input data via a memory-mapped I/O 
port and writes the filtered values to another port. The second program 
also serves to demonstrate how the interval timer can be used to generate 
processor interrupts at a desired sampling rate and how interrupts are 
handled on the ADSP-21020/21010. The software aspects of I/O port 
hardware are also described. 


8.2 EXAMPLE #1 : DATA IN MEMORY, NO INTERRUPTS 

The first example (program flowchart shown in Figure 8.1) reads a buffer 
of input values stored in memory, passes the data through the filter, and 
writes the results to another buffer in memory. Because no interrupts are 
used, execution progresses at the processor instruction rate until all input 
values have been filtered. After filtering the input data set, execution 
would normally continue, so in this example, the IDLE instruction is used 
to halt execution. The example also shows how the input data is initialized 
in memory. 



[Block diagram: INBUF -> IIRMEM.ASM (main program), which calls CASCADE.ASM (subroutines) -> OUTBUF]



Memory-resident input sample vector ( "INBUF" ) is filtered, storing the result 
vector in another data buffer ( "OUTBUF" ). No interrupts are used. Execution 
ceases when done. 


Figure 8.1 Program Flow for First Example 




[Block diagram: IIRIRQ.ASM (main program) calls CASCADE.ASM (subroutines); execution is driven by an INTERRUPT]


Sampled data is read from a memory-mapped I/O port during execution under interrupt 
control. Results are written out to another port. The DSP is interrupted at the sampling rate. 
The DSP's interval timer is the sampling clock. 


Figure 8.2 Program Flow for Second Example 






8.2.1 File Inventory 

Let's start by taking inventory of the files used in this system. Table 8.1 
enumerates the files and gives a brief description of their functions: 


Filename         Function

generic.ach      architecture description file
iirmem.asm       main assembly program
cascade.asm      cascaded biquad filter subroutine (called by main program)
iircoefs.dat     filter coefficients
makefile.mem     MS-DOS "make" file used to create executable program


Table 8.1 Files Used for Memory-Based (No Interrupts) Program 


8.2.2 Architecture Description File (generic.ach) 

Figure 8.3, on page 8-8, shows how actual external memory could be 
connected, in hardware, to the ADSP-21020 in this example. Notice that 
there are three physical memory banks, each 2048 (0x800) words in length. 
Figure 8.4 shows the total addressable program and data memory map of 
the ADSP-21020 and highlights the portions used in Figure 8.3. Listing 8.1, 
on the following page, contains the description of the architecture found 
in Figure 8.3. This file, called the architecture description file, is used by both 
the linker and the simulator during the code development process. The 
architecture description file guides the linker in placing code and data in 
the ADSP-21020 memory map. This file is also used by the simulator in 
order to simulate not only the processor itself, but also the memory 
connected to it. 

If you are familiar with the ADSP-2100 Family (16-bit fixed-point DSP) 
development tools, you may be surprised that the ADSP-21000 Family 
Development Software has no System Builder program. Instead, the 
ADSP-21020/21010 software tools read the .ach text file directly for the 
necessary information. In other words, you create the .ach file with a text 
editor and that's it — no other actions are needed. 
.SYSTEM     generic;
.PROCESSOR  ADSP21020;

directive memory  start             end             memory  segment
          type    address           address         space   name

.SEGMENT  /RAM    /BEGIN=0x000000   /END=0x000007   /PM     resrvd0;
.SEGMENT  /RAM    /BEGIN=0x000008   /END=0x00000F   /PM     rst_svc;
.SEGMENT  /RAM    /BEGIN=0x000010   /END=0x000017   /PM     resrvd1;
.SEGMENT  /RAM    /BEGIN=0x000018   /END=0x00001F   /PM     sovf_svc;
.SEGMENT  /RAM    /BEGIN=0x000020   /END=0x000027   /PM     tmzh_svc;
.SEGMENT  /RAM    /BEGIN=0x000028   /END=0x00002F   /PM     irq3_svc;
.SEGMENT  /RAM    /BEGIN=0x000030   /END=0x000037   /PM     irq2_svc;
.SEGMENT  /RAM    /BEGIN=0x000038   /END=0x00003F   /PM     irq1_svc;
.SEGMENT  /RAM    /BEGIN=0x000040   /END=0x000047   /PM     irq0_svc;
.SEGMENT  /RAM    /BEGIN=0x000048   /END=0x00004F   /PM     resrvd2;
.SEGMENT  /RAM    /BEGIN=0x000050   /END=0x000057   /PM     resrvd3;
.SEGMENT  /RAM    /BEGIN=0x000058   /END=0x00005F   /PM     cb7_svc;
.SEGMENT  /RAM    /BEGIN=0x000060   /END=0x000067   /PM     cb15_svc;
.SEGMENT  /RAM    /BEGIN=0x000068   /END=0x00006F   /PM     resrvd4;
.SEGMENT  /RAM    /BEGIN=0x000070   /END=0x000077   /PM     tmzl_svc;
.SEGMENT  /RAM    /BEGIN=0x000078   /END=0x00007F   /PM     fix_svc;
.SEGMENT  /RAM    /BEGIN=0x000080   /END=0x000087   /PM     flto_svc;
.SEGMENT  /RAM    /BEGIN=0x000088   /END=0x00008F   /PM     fltu_svc;
.SEGMENT  /RAM    /BEGIN=0x000090   /END=0x000097   /PM     flti_svc;
.SEGMENT  /RAM    /BEGIN=0x000098   /END=0x00009F   /PM     resrvd5;
.SEGMENT  /RAM    /BEGIN=0x0000A0   /END=0x0000A7   /PM     resrvd6;
.SEGMENT  /RAM    /BEGIN=0x0000A8   /END=0x0000AF   /PM     resrvd7;
.SEGMENT  /RAM    /BEGIN=0x0000B0   /END=0x0000B7   /PM     resrvd8;
.SEGMENT  /RAM    /BEGIN=0x0000B8   /END=0x0000BF   /PM     resrvd9;
.SEGMENT  /RAM    /BEGIN=0x0000C0   /END=0x0000C7   /PM     sft0_svc;
.SEGMENT  /RAM    /BEGIN=0x0000C8   /END=0x0000CF   /PM     sft1_svc;
.SEGMENT  /RAM    /BEGIN=0x0000D0   /END=0x0000D7   /PM     sft2_svc;
.SEGMENT  /RAM    /BEGIN=0x0000D8   /END=0x0000DF   /PM     sft3_svc;
.SEGMENT  /RAM    /BEGIN=0x0000E0   /END=0x0000E7   /PM     sft4_svc;
.SEGMENT  /RAM    /BEGIN=0x0000E8   /END=0x0000EF   /PM     sft5_svc;
.SEGMENT  /RAM    /BEGIN=0x0000F0   /END=0x0000F7   /PM     sft6_svc;
.SEGMENT  /RAM    /BEGIN=0x0000F8   /END=0x0000FF   /PM     sft7_svc;
.SEGMENT  /RAM    /BEGIN=0x000100   /END=0x0007FF   /PM     pm_code;
.SEGMENT  /RAM    /BEGIN=0x000800   /END=0x000FFF   /PM     pm_data;
.SEGMENT  /RAM    /BEGIN=0x000000   /END=0x0007FF   /DM     dm_data;
.SEGMENT  /PORT   /BEGIN=0xF0000000 /END=0xF000001F /DM     ports;

.ENDSYS;

Listing 8.1 generic.ach 




The .system and .endsys directives indicate the start and end of the 
architecture description. Although the segment names are arbitrary, they 
are chosen here to be self-documenting. 

The first of the three physical 2K-word memory blocks is divided into 33 
segments. The first 32 segments are mapped to the 256 (0x100) locations 
reserved for the interrupt vector table, and the other segment named 
pm_code contains the balance (0x700) for general instruction code storage. 
This memory is connected to the program memory interface. 
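
The arithmetic behind this division of the first 2K-word block can be checked with a short Python sketch. It only restates the numbers given above (32 segments of 8 locations, then `pm_code`); the variable and function names are illustrative and are not part of the development software.

```python
SEGMENT_SIZE = 8        # locations per interrupt vector segment
VECTOR_SEGMENTS = 32    # number of vector segments in Listing 8.1
BLOCK_WORDS = 0x800     # one physical 2K-word memory block


def vector_segment_ranges():
    """Return (begin, end) address pairs for the 32 vector segments."""
    return [(n * SEGMENT_SIZE, n * SEGMENT_SIZE + SEGMENT_SIZE - 1)
            for n in range(VECTOR_SEGMENTS)]


ranges = vector_segment_ranges()
table_words = VECTOR_SEGMENTS * SEGMENT_SIZE   # locations used by the vector table
pm_code_begin = ranges[-1][1] + 1              # first address after the table
pm_code_words = BLOCK_WORDS - table_words      # balance left for pm_code

print(hex(table_words), hex(pm_code_begin), hex(pm_code_words))
```

The three printed values correspond to the 0x100 reserved locations, the `/BEGIN` address of `pm_code`, and the 0x700-word balance.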

The second of the three physical 2K-word memory blocks is set aside for 
data storage in program memory space. This block only contains one 
segment named pm_data. This memory is also connected to the program 
memory interface. See the following section for a description of how the 
processor differentiates between this program memory block and the 
previous one. 

The third of the three physical 2K-word memory blocks is general- 
purpose data memory. This block only contains one segment named 
dm_data. This memory is connected to the data memory interface. 

8.2.3 External vs. Internal Address Decoding 

Figure 8.3, on the following page, shows how external hardware address 
decoding logic arbitrates to select which program memory block drives 
the program memory interface at a given time. The block titled "address 
decode" is an address comparator. Depending on the value on the PMA 
bus, the address decoder enables either one memory block or the other. 

To avoid the need for this external logic, the ADSP-21020 incorporates 
internal address decoding logic. Both the program memory and the data 
memory spaces are divided into several banks, each of which has its own 
select line (PMS0 and PMS1 for program memory; DMS0, DMS1, DMS2 
and DMS3 for data memory). When accessing memory, only the 
appropriate select line becomes active, depending on the address value. 
The boundary addresses are stored in special registers on the ADSP-21020 
(PMBANK1, and DMBANK1, 2, 3). Example #2 "iirirq" makes use of these 
bank selects. Compare Figure 8.4 for example #1 and Figure 8.6 for 
example #2. 
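
The internal decoding can be pictured as a comparison of the address against the bank boundary registers. The following Python sketch is a model for illustration only: it assumes each boundary register holds the lowest address of its bank (as Table 7.3 describes them) and uses the default boundary values shown in Figure 8.4. The function name `bank_select` is hypothetical.

```python
from bisect import bisect_right

# Default lower boundaries at reset, taken from Figure 8.4.
DM_BOUNDARIES = [0x40000000, 0x80000000, 0xC0000000]  # DMBANK1, DMBANK2, DMBANK3
PM_BOUNDARIES = [0x800000]                            # PMBANK1


def bank_select(address, boundaries):
    """Return the bank number whose select line would be active.

    Bank 0 always starts at address zero; bank n starts at boundaries[n-1].
    """
    return bisect_right(boundaries, address)


# All segments of generic.ach below 0x1000 fall in bank 0 of their space.
print(bank_select(0x000100, PM_BOUNDARIES))    # pm_code
print(bank_select(0x000007FF, DM_BOUNDARIES))  # end of dm_data
print(bank_select(0xF0000000, DM_BOUNDARIES))  # ports segment
```

With the default boundaries, the `ports` segment at 0xF0000000 lands in data memory bank 3, while the RAM segments all sit in bank 0.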
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Figure 8.3 Physical Memory Architecture Described in “generic.ach” 

8.2.4 Specifying The .ACH File 

The architecture file can be specified to the linker (ld21k) and the 
simulator (sim21k) with the -a <filename> switch at invocation. For 
example: 


ld21k iirmem cascade -a generic -m 


runs the linker using "generic.ach" as the architecture file, and 


sim21k -e iirmem -a generic 


invokes the simulator loading "generic.ach" as the architecture file. 

The filename extension ".ach" could be typed explicitly, but the linker 
and simulator assume this extension by default, so it is omitted in this 
example. 

8.2.5 Main Program (iirmem.asm) 

The main assembly program, called "iirmem.asm," is shown in Listing 8.2. 
This program performs the ADSP-21020's initial setups after reset, then 
proceeds into the main processing loop. 
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Program Memory 


0x00 0000     start of interrupt vector table
0x00 0008     "rst_svc"
0x00 0010
0x00 0100     "pm_code"
0x00 0800     "pm_data"
0x00 1000
0x80 0000     (PMBANK1 default address)
0xFF FFFF


Data Memory

0x0000 0000   "dm_data"
0x0000 0800
0x4000 0000   (DMBANK1 default address)
0x8000 0000   (DMBANK2 default address)
0xC000 0000   (DMBANK3 default address)
0xFFFF FFFF

[Legend: shading distinguishes regions accounted for in the architecture
description file from address space unused in this example; the addresses
below 0x00 0100 are associated with the interrupt vector table.]


Figure 8.4 Memory Map Described in “generic.ach” 



The typical initial setups include: 

• Disabling interrupts, (default at processor reset) 

• Initializing the interrupt vector table, 

• Altering the values in the memory hardware configuration registers: 
DMBANK1, 2, 3, PMBANK1, DMWAIT, PMWAIT 

• Initializing address and data registers, 

• Initializing memory locations and buffers, 

• Configuring and initializing on-chip peripherals such as the timer, 

• Configuring interrupts, and 

• Enabling interrupts (done last). 
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The main processing loop is either: 

• A list of tasks which eventually terminates, possibly with interrupts, or 

• An endless loop, most often interrupted with interrupt-driven tasks. 


.EXTERN     cascaded_biquad, cascaded_biquad_init;
.GLOBAL     coefs, dline;
.PRECISION=40;
.ROUND_NEAREST;

#define SAMPLES  300
#define SECTIONS 3

.SEGMENT /DM dm_data;
.VAR    inbuf[SAMPLES]=1.0, 0.0;    { input = unit impulse }
.VAR    outbuf[SAMPLES];            { ends up holding impulse response }
.VAR    dline[SECTIONS*2];          { w", w', NEXT w", NEXT w', ... }
.ENDSEG;

.SEGMENT /PM pm_data;
.VAR    coefs[SECTIONS*4]="iircoefs.dat";   { a12, a11, b12, b11, a22, a21, ... }
.ENDSEG;

.SEGMENT /PM rst_svc;
        jump begin;
.ENDSEG;

.SEGMENT /PM pm_code;

{ initial setups: }

begin:  pmwait=0x0021;              { zero wait states for all of PM }
        dmwait=0x8421;              { zero wait states for all of DM }
        b3=inbuf;   l3=0;
        b4=outbuf;  l4=0;
        l0=0; l1=0; l8=0;
        m1=1; m8=1;
        call cascaded_biquad_init (db);     { zero the delay line }
            r0=SECTIONS;
            b0=dline;

{ main processing loop: }

        lcntr=SAMPLES, do filtering until lce;
            f8=dm(i3,1);
            call cascaded_biquad (db);      { input=F8, output=F8 }
                b0=dline;
                b8=coefs;
filtering:  dm(i4,1)=f8;
done:   idle;

.ENDSEG;

Listing 8.2 iirmem.asm
8.2.5.1 Initial Setups: Initialization Following Reset 

This section describes each of the initial setups. 

Disabling interrupts. Resetting the processor automatically disables 
interrupts by setting the IMASK register to zero. IRPTL, the interrupt latch 
register, is not affected by reset and therefore should be cleared before 
enabling any interrupts. However, no interrupts are ever enabled in this 
example. Therefore you do not need to do anything with these registers in 
this case. 

Initializing the interrupt vector table. At the beginning of instruction 
memory space is the non-relocatable interrupt vector table as required by 
the ADSP-21020 (see Figure 8.4). For any given interrupt signal, there is a 
predetermined instruction address (called the interrupt vector) which is 
branched to when the interrupt occurs. The instruction at this address is 
often a jump to an interrupt service routine which resides outside the 
interrupt vector table (see Listing 8.2). 

Interrupt vector table addresses are spaced by eight locations. This allows 
eight cycles of code to be executed for a given interrupt without a branch 
and the associated overhead, providing quick interrupt servicing for short 
routines. The only requirement is to terminate the instruction sequence 
with a return from interrupt (RTI) instruction. 

Interrupt service code within the interrupt vector table which extends 
beyond the allotted eight locations is not advised. For example, it is 
possible to extend into the space reserved for the next interrupt if that 
interrupt is not being used, but this is a poor programming practice. 

In addition, certain interrupt vector entries should not be used at all. The 
first vector location PM[0x00] is reserved and should never be used by the 
programmer. The remaining vectors labeled "reserved" could be used for 
code, but this would be a poor programming practice. If those vector 
locations are utilized by future members of the ADSP-21000 family, using 
those locations on the ADSP-21020 today may result in code 
incompatibility in the future. 

By sectioning the interrupt table space (PM[0] - PM[0xFF]) into 32 
segments of 8 locations each, both of these poor programming practices 
are effectively discouraged. 
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This particular vector table simply causes the processor to branch after 
reset to the beginning of the executable code, at the label begin, which 
immediately follows the end of the vector table (PM[0x100]). In this 
example, the only interrupt vector table instructions are stored in the 
"rst_svc" segment: 


.SEGMENT /PM rst_svc;
        jump begin;
.ENDSEG;


All other vectors are left undefined. 

Example #2 ("iirirq") shows how other interrupts are incorporated into 
the vector table. 

Altering the values in the memory hardware configuration registers. 

The first instructions executed after reset alter the ADSP-21020 memory 
configuration registers, the DMWAIT and PMWAIT registers. These wait 
state configuration registers contain default values at reset. In this 
example, the registers are altered with values that set all software wait 
states to zero. The wait state mode is set to software-programmed wait 
only (i.e., the DMACK and PMACK hardware inputs are not used). 

begin: pmwait=0x0021 ; { zero wait states for all of PM } 

dmwait=0x8421 ; { zero wait states for all of DM } 


The programmable memory bank boundaries are defined with 
PMBANK1, DMBANK1, DMBANK2 and DMBANK3 registers. These 
memory bank registers contain default address values at reset (see Figure 
8.4). Since bank 0 of both program and data memory always starts at 
address zero, there are no PMBANK0 or DMBANK0 registers. In this 
example, the bank default addresses are left unchanged. Example #2 
shows a more extensive memory configuration procedure. 

Initializing address and data registers. Registers which reside in the data 
address generators (DAGs) are used for addressing purposes. These 
registers may be initialized during the initial setup period, especially in 
short code examples such as this one, where DAG registers serve 
unchanging functions. The instructions that initialize these registers are: 


        b3=inbuf;   l3=0;
        b4=outbuf;  l4=0;
        l0=0; l1=0; l8=0;
        m1=1; m8=1;

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Notice that setting a B (base) register automatically also sets the associated 
I (index) register to the same value. Although the B register is only 
necessary in circular buffering applications, during initial setup it is a 
good practice to set the B register instead of the I register even if the buffer 
is not circular. That way, if you later decide to make the buffer a circular 
buffer by setting the associated L (length) register to a certain value, there 
will be no problems with having not initialized the B register. 

It is especially important to preset the length (L) registers. Neglecting to 
set the L registers may cause circular buffer addressing when nonzero 
data appears in the L registers upon powerup. Where circular addressing 
is NOT desired, set the appropriate L registers to zero. Where circular 
addressing IS desired, set the appropriate L registers to the circular buffer 
length. 
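
Why a stray nonzero L register matters can be seen in a small Python model of a post-modify address update. This is a simplified sketch, not the exact DAG hardware logic: it assumes the modify value is smaller in magnitude than the buffer length, and the function name `post_modify` and the addresses used are hypothetical.

```python
def post_modify(i, m, b=0, l=0):
    """Return the next index register value after a post-modify by m.

    i: current index (I) value     m: modify (M) value
    b: base (B) register           l: length (L) register, 0 = linear
    """
    nxt = i + m
    if l != 0:                 # circular addressing is active
        if nxt >= b + l:       # stepped past the end of the buffer
            nxt -= l
        elif nxt < b:          # stepped before the start of the buffer
            nxt += l
    return nxt


# Linear addressing (L = 0): the index simply keeps increasing.
print(hex(post_modify(0x105, 1)))
# Circular addressing over a 6-word buffer at 0x100: the index wraps.
print(hex(post_modify(0x105, 1, b=0x100, l=6)))
```

With L left at an unintended nonzero value, the second behaviour would occur where the first was expected, which is the failure the paragraph above warns about.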

The initialization of one of the B registers (B0) occurs in the program after 
a delayed call instruction but it is actually executed before the call, with 
the other initializations, since the call is delayed. This is shown in the next 
section. 

Initializing memory locations and buffers. The delay line storage 
memory for the IIR filter is cleared to zero by calling the subroutine 
cascaded_biquad_init. This memory initialization code resides in another 
file called "cascade.asm." Notice that delayed branching is used for 
execution efficiency. 

        call cascaded_biquad_init (db);     { zero the delay line }
            r0=SECTIONS;                    { executed on the way }
            b0=dline;                       { into the subroutine }
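
The ordering effect of a delayed call can be sketched in Python as a simple trace builder. This is a conceptual model only: it treats instructions as strings, assumes exactly two instructions follow the call before the branch takes effect (as described in this tutorial), and assumes the subroutine returns to the instruction after those two. The names `delayed_call_trace` and the placeholder subroutine body are hypothetical.

```python
def delayed_call_trace(listing, call_index, subroutine):
    """Return the order in which instructions take effect.

    listing:    instructions in program order
    call_index: position of the delayed (db) call in listing
    subroutine: instructions of the called routine
    """
    trace = list(listing[:call_index + 1])            # up to and including the call
    trace += listing[call_index + 1:call_index + 3]   # two instructions after the call
    trace += subroutine                               # then the subroutine body
    trace += listing[call_index + 3:]                 # execution resumes after them
    return trace


listing = [
    "call cascaded_biquad_init (db);",
    "r0=SECTIONS;",
    "b0=dline;",
    "lcntr=SAMPLES, do filtering until lce;",
]
body = ["<cascaded_biquad_init body>", "rts;"]   # placeholder for cascade.asm code

for step in delayed_call_trace(listing, 0, body):
    print(step)
```

The trace shows `r0` and `b0` already loaded when the subroutine body starts, which is why they can serve as its input parameters.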



Configuring and initializing on-chip peripherals. No on-chip 
peripherals are used. No setups are required in this case. 

Configuring and enabling interrupts. Since interrupts are not used in this 
example, they are neither configured nor enabled. Upon reset, the default 
register values are such that if you are not planning on using interrupts, 
you do not have to worry about interrupt configuration at all. 
8.2.5.2 Main Processing Loop 

Once the initial setups are complete, the filtering operation can begin. 
Three hundred input samples reside in an input data buffer, and the 
program must process each one via the filtering operation and store the 
results in the output data buffer. The 300 iterations are automatically 
managed in hardware by a zero-overhead DO UNTIL loop construct. The 
loop is set up in conjunction with assigning the value 300 to the loop 
counter register (LCNTR). Unlike in the ADSP-2100 family, these two 
operations can be done in the same instruction cycle. For example: 

#define SAMPLES 300 

        lcntr=SAMPLES, do filtering until lce; 

Five instructions are executed within the loop. First, an input data value is 
read from the input buffer. Then the cascaded_biquad subroutine is called. 
The subroutine code resides in a separate file called "cascade.asm." 
Because the call is a delayed branch, the two instructions that follow the 
call instruction are actually executed before the call. This is indicated here 
by indenting these two instructions. The instruction that follows the return 
from the subroutine stores the returned value in the output buffer. For 
example: 

           f8=dm(i3,1); 
           call cascaded_biquad (db);     { input=F8, output=F8 } 
              b0=dline; 
              b8=coefs; 
filtering: dm(i4,1)=f8; 

Upon termination of the loop, the ADSP-21020 continues by executing 
subsequent code. In this example, that code is an IDLE instruction, which 
halts the processor. When using the simulator, you could set a breakpoint 
at the instruction labeled done to halt execution. A breakpoint will not only 
halt simulation when reached, but it will also display a message. For 
example: 

done: idle; { set breakpoint here! } 






8.2.6 Creating The Executable Program 

The executable program for this example is created using the following 
commands to invoke the ADSP-21000 Family Assembler and Linker: 


asm21k iirmem 
asm21k cascade 

ld21k iirmem cascade -a generic -m 

The memory map file, created by linking the files using the -m switch, 
shows how the linker loads the interrupt vector instructions, the main 
program, the called subroutines, and all the associated data spaces into the 
memory segments defined by the architecture file in Listing 8.1. 

8.2.7 Simulation 

This example system can be simulated by entering the following 
command to invoke the ADSP-21020/21010 Simulator: 


sim21k -e iirmem -a generic 


Look at the input buffer and the output buffer. Notice that the output 
buffer does not contain the results yet. Set a breakpoint at the program 
memory location labelled done. Allow the simulator to run (execute the 
processor code). It will halt upon fetching the breakpoint instruction. Look 
at the output buffer, which should now contain the result values. These 
results may be dumped into a file using the memory dump command. 

8.3 EXAMPLE #2: INTERRUPT-DRIVEN, WITH PORT I/O 

This second example (program flowchart shown in Figure 8.2) reads one 
input value from a memory-mapped I/O port, passes the data through the 
filter, and writes the result value to another port. This transaction occurs 
whenever requested by an interrupt signal, in this case the timer interrupt. 
This example uses the interval timer to generate periodic interrupts. 
Specific topics which will be explored here are: 

• programming the interrupt vector table 

• setting up the ADSP-21020 to react to interrupts and configuring how 
they are used 

• using memory-mapped I/O ports 

• simulating operation with the software development tools 

• benchmarking the execution efficiency of interrupt-driven code 

• interrupt debugging hints 

Some information relevant to this example is presented in example #1 
(iirmem). Instead of duplicating this information, example #2 highlights 
new or different information only. 

8.3.1 File Inventory 

Table 8.2 lists the files and gives a brief description of their functions: 


Filename        Function 

iirirq.ach      architecture description file 
def21020.h      bit position definitions for ADSP-21020 system registers 
iirirq.asm      main assembly program 
cascade.asm     cascaded biquad filter subroutine 
iircoefs.dat    filter coefficients 
makefile.irq    MS-DOS "make" file used to create executable program 
input.dat       example input file for I/O port simulation 
out300.dat      example output file from I/O port simulation 

Table 8.2 Files Used for Interrupt-Driven Program using Port I/O 


8.3.2 Architecture Description File (iirirq.ach) 

Figure 8.5 shows how actual external memory could be connected, in 
hardware, to the ADSP-21020 in this example. Notice three physical RAM 
memory banks, each 2048 (0x800) words in length, in conjunction with 
two memory-mapped I/O ports. Figure 8.6 shows the total addressable 
program and data memory maps of the ADSP-21020 and highlights which 
portions are used by the system in Figure 8.5. Listing 8.3 contains the 
description of the architecture found in Figure 8.5. 



.SYSTEM IIRIRQ_example_arch_file; 
.PROCESSOR = ADSP21020; 

directive  memory  start address       end address       memory  segment 
           type                                          space   name 

.SEGMENT   /RAM    /BEGIN=0x000008     /END=0x00000F     /PM     rst_svc; 
.SEGMENT   /RAM    /BEGIN=0x000020     /END=0x000027     /PM     tmzh_svc; 
.SEGMENT   /RAM    /BEGIN=0x000100     /END=0x0007FF     /PM     pm_code; 
.SEGMENT   /RAM    /BEGIN=0x000800     /END=0x000FFF     /PM     pm_bank1; 
.SEGMENT   /RAM    /BEGIN=0x00000000   /END=0x00000FFF   /DM     dm_bank0; 
.SEGMENT   /PORT   /BEGIN=0x00001000   /END=0x00001000   /DM     dm_bank1; 
.SEGMENT   /PORT   /BEGIN=0x00002000   /END=0x00002000   /DM     dm_bank2; 

.ENDSYS; 

Listing 8.3 iirirq.ach 






Figure 8.5 Physical Memory Architecture Described in “iirirq.ach” 

[Figure 8.6: program memory and data memory maps; all addresses are in hex. 
Program memory boundaries shown: 0x000000, 0x000008, 0x000010, 0x000020, 
0x000028, 0x000100, 0x000800 (PMBANK1 new address), 0x001000, 0xFFFFFF. 
Data memory boundaries shown: 0x00000000, 0x00001000 (DMBANK1 new address), 
0x00001001, 0x00002000 (DMBANK2 new address), 0x00002001, 
0xC0000000 (DMBANK3 default address), 0xFFFFFFFF. 
Legend: regions accounted for in the architecture description file; address 
space unused in this example; addresses associated with the interrupt vector 
table.] 

Figure 8.6 Memory Map Described in “iirirq.ach” 

8.3.3 Main Program (iirirq.asm) 

The main assembly program, called "iirirq.asm," is shown in Listing 8.4. 
Following the typical structure of an ADSP-21020 program as shown in 
example #1, this program performs the ADSP-21020's initial setups 
after reset, then proceeds into the main processing loop. Because 
interrupts, the interval timer, and memory bank selects are used, there 
are more setup tasks in this example than in example #1. 


#include "def21020.h"                    { bit definitions } 

#define SAMPLES  300 
#define SECTIONS 3 

.EXTERN cascaded_biquad, cascaded_biquad_init; 
.GLOBAL coefs, dline; 
.PRECISION = 40; 
.ROUND NEAREST; 

.SEGMENT /DM dm_bank0;                   { selected by DMS0~ } 
.VAR dline[SECTIONS*2];                  { filter delay line: } 
.ENDSEG;                                 { w'', w', NEXT w'', NEXT w', ... } 

.SEGMENT /DM dm_bank1;                   { selected by DMS1~ } 
.VAR in_channel;                         { memory-mapped I/O port } 
.ENDSEG; 

.SEGMENT /DM dm_bank2;                   { selected by DMS2~ } 
.VAR out_channel;                        { memory-mapped I/O port } 
.ENDSEG; 

.SEGMENT /PM pm_bank1;                   { selected by PMS1~ } 
.VAR coefs[SECTIONS*4]="iircoefs.dat";   { a12,a11,b12,b11,a22,a21,... } 
.ENDSEG; 

.SEGMENT /PM rst_svc;                    { processor RESET service } 
        jump begin; 
.ENDSEG; 

.SEGMENT /PM tmzh_svc;                   { timer interrupt service } 
        jump new_sample; 
.ENDSEG; 

.SEGMENT /PM pm_code;                    { selected by PMS0~ } 

{ initial setups: } 

begin:  pmwait=0x0021;                   { RAM = zero wait states } 
        dmwait=0xc401;                   { RAM = zero wait states 
                                           in_channel = ext. hardware ACK 
                                           out_channel = 5 automatic waits } 
        pmbank1=0x000800;                { first addr of pm bank 1 (PMS1~) } 
        dmbank1=0x00001000;              { first addr of dm bank 1 (DMS1~) } 
        dmbank2=0x00002000;              { first addr of dm bank 2 (DMS2~) } 
        l0=0; l1=0; l8=0; 
        m1=1; m8=1; 
        call cascaded_biquad_init (db);  { zero the delay line } 
           r0=SECTIONS; 
           b0=dline; 
        tperiod=199;                     { 100 kHz sampling if CLKIN=20.0 MHz } 
        tcount=199; 
        bit set imask TMZHI;             { allow timer interrupt } 
        bit set mode2 TIMEN;             { turn on timer } 
        bit set mode1 IRPTEN;            { allow interrupts } 

{ main processing loop: } 

wait:   idle;                            { wait for interrupts indefinitely } 
        jump wait;                       { after rti, go wait for more } 

{ interrupt service routine which does the filtering: } 

new_sample: 
        f8=dm(in_channel);               { simulate this channel with a file } 
        call cascaded_biquad (db);       { input=F8, output=F8 } 
           b0=dline; 
           b8=coefs; 
        rti (db); 
           dm(out_channel)=f8;           { simulate this channel with file } 
           nop; 
.ENDSEG; 

Listing 8.4 iirirq.asm 



{ 

def21020.h - STATUS REGISTER BIT DEFINITIONS FOR ADSP-21020 


This include file contains a list of "defines" to enable the programmer to 
use symbolic names for all of the system register bits for the ADSP-21020. 
} 


{ MODE1 register } 

#define BR0     0x00000002   { Bit 1:  Bit-reverse for I0 (uses DMS0- only) } 
#define SRCU    0x00000004   { Bit 2:  Alt. register select for comp. units } 
#define SRD1H   0x00000008   { Bit 3:  DAG1 alt. register select (7-4) } 
#define SRD1L   0x00000010   { Bit 4:  DAG1 alt. register select (3-0) } 
#define SRD2H   0x00000020   { Bit 5:  DAG2 alt. register select (15-12) } 
#define SRD2L   0x00000040   { Bit 6:  DAG2 alt. register select (11-8) } 
#define SRRFH   0x00000080   { Bit 7:  Register file alt. select for R(15-8) } 
#define SRRFL   0x00000400   { Bit 10: Register file alt. select for R(7-0) } 
#define NESTM   0x00000800   { Bit 11: Interrupt nesting enable } 
#define IRPTEN  0x00001000   { Bit 12: Global interrupt enable } 
#define ALUSAT  0x00002000   { Bit 13: Enable ALU fixed-pt. saturation } 
#define TRUNC   0x00008000   { Bit 15: 1=fltg-pt. truncation 0=Rnd to nearest } 
#define RND32   0x00010000   { Bit 16: 1=32-bit fltg-pt. rounding 0=40-bit rnd } 


{ MODE2 register } 

#define IRQ0E   0x00000001   { Bit 0:  IRQ0- 1=edge sens. 0=level sens. } 
#define IRQ1E   0x00000002   { Bit 1:  IRQ1- 1=edge sens. 0=level sens. } 
#define IRQ2E   0x00000004   { Bit 2:  IRQ2- 1=edge sens. 0=level sens. } 
#define IRQ3E   0x00000008   { Bit 3:  IRQ3- 1=edge sens. 0=level sens. } 
#define CADIS   0x00000010   { Bit 4:  Cache disable } 
#define TIMEN   0x00000020   { Bit 5:  Timer enable } 
#define FLG0O   0x00008000   { Bit 15: FLAG0 1=output 0=input } 
#define FLG1O   0x00010000   { Bit 16: FLAG1 1=output 0=input } 
#define FLG2O   0x00020000   { Bit 17: FLAG2 1=output 0=input } 
#define FLG3O   0x00040000   { Bit 18: FLAG3 1=output 0=input } 
#define CAFRZ   0x00080000   { Bit 19: Cache freeze } 


{ ASTAT register } 

#define AZ      0x00000001   { Bit 0:  ALU result zero or fltg-pt. underflow } 
#define AV      0x00000002   { Bit 1:  ALU overflow } 
#define AN      0x00000004   { Bit 2:  ALU result negative } 
#define AC      0x00000008   { Bit 3:  ALU fixed-pt. carry } 
#define AS      0x00000010   { Bit 4:  ALU X input sign (ABS and MANT ops) } 
#define AI      0x00000020   { Bit 5:  ALU fltg-pt. invalid operation } 
#define MN      0x00000040   { Bit 6:  Multiplier result negative } 
#define MV      0x00000080   { Bit 7:  Multiplier overflow } 
#define MU      0x00000100   { Bit 8:  Multiplier fltg-pt. underflow } 
#define MI      0x00000200   { Bit 9:  Multiplier fltg-pt. invalid operation } 
#define AF      0x00000400   { Bit 10: ALU fltg-pt. operation } 
#define SV      0x00000800   { Bit 11: Shifter overflow } 
#define SZ      0x00001000   { Bit 12: Shifter result zero } 
#define SS      0x00002000   { Bit 13: Shifter input sign } 
#define BTF     0x00040000   { Bit 18: Bit test flag for system registers } 
#define FLG0    0x00080000   { Bit 19: FLAG0 value } 
#define FLG1    0x00100000   { Bit 20: FLAG1 value } 
#define FLG2    0x00200000   { Bit 21: FLAG2 value } 
#define FLG3    0x00400000   { Bit 22: FLAG3 value } 
#define CACC0   0x01000000   { Bit 24: Compare Accumulation Bit 0 } 
#define CACC1   0x02000000   { Bit 25: Compare Accumulation Bit 1 } 
#define CACC2   0x04000000   { Bit 26: Compare Accumulation Bit 2 } 
#define CACC3   0x08000000   { Bit 27: Compare Accumulation Bit 3 } 
#define CACC4   0x10000000   { Bit 28: Compare Accumulation Bit 4 } 
#define CACC5   0x20000000   { Bit 29: Compare Accumulation Bit 5 } 
#define CACC6   0x40000000   { Bit 30: Compare Accumulation Bit 6 } 
#define CACC7   0x80000000   { Bit 31: Compare Accumulation Bit 7 } 


{ STKY register } 

#define AUS     0x00000001   { Bit 0:  ALU fltg-pt. underflow } 
#define AVS     0x00000002   { Bit 1:  ALU fltg-pt. overflow } 
#define AOS     0x00000004   { Bit 2:  ALU fixed-pt. overflow } 
#define AIS     0x00000020   { Bit 5:  ALU fltg-pt. invalid operation } 
#define MOS     0x00000040   { Bit 6:  Multiplier fixed-pt. overflow } 
#define MVS     0x00000080   { Bit 7:  Multiplier fltg-pt. overflow } 
#define MUS     0x00000100   { Bit 8:  Multiplier fltg-pt. underflow } 
#define MIS     0x00000200   { Bit 9:  Multiplier fltg-pt. invalid operation } 
#define CB7S    0x00020000   { Bit 17: DAG1 circular buffer 7 overflow } 
#define CB15S   0x00040000   { Bit 18: DAG2 circular buffer 15 overflow } 
#define PCFL    0x00200000   { Bit 21: PC stack full } 
#define PCEM    0x00400000   { Bit 22: PC stack empty } 
#define SSOV    0x00800000   { Bit 23: Status stack overflow (MODE1 and ASTAT) } 
#define SSEM    0x01000000   { Bit 24: Status stack empty } 
#define LSOV    0x02000000   { Bit 25: Loop stack overflow } 
#define LSEM    0x04000000   { Bit 26: Loop stack empty } 


{ IRPTL and IMASK and IMASKP registers } 

#define RSTI    0x00000002   { Bit 1:  Address: 08: Reset } 
#define SOVFI   0x00000008   { Bit 3:  Address: 18: Stack overflow } 
#define TMZHI   0x00000010   { Bit 4:  Address: 20: Timer = 0 (high priority) } 
#define IRQ3I   0x00000020   { Bit 5:  Address: 28: IRQ3- asserted } 
#define IRQ2I   0x00000040   { Bit 6:  Address: 30: IRQ2- asserted } 
#define IRQ1I   0x00000080   { Bit 7:  Address: 38: IRQ1- asserted } 
#define IRQ0I   0x00000100   { Bit 8:  Address: 40: IRQ0- asserted } 
#define CB7I    0x00000800   { Bit 11: Address: 58: Circ. buffer 7 overflow } 
#define CB15I   0x00001000   { Bit 12: Address: 60: Circ. buffer 15 overflow } 
#define TMZLI   0x00004000   { Bit 14: Address: 70: Timer = 0 (low priority) } 
#define FIXI    0x00008000   { Bit 15: Address: 78: Fixed-pt. overflow } 
#define FLTOI   0x00010000   { Bit 16: Address: 80: fltg-pt. overflow } 
#define FLTUI   0x00020000   { Bit 17: Address: 88: fltg-pt. underflow } 
#define FLTII   0x00040000   { Bit 18: Address: 90: fltg-pt. invalid } 
#define SFT0I   0x01000000   { Bit 24: Address: C0: user software int 0 } 
#define SFT1I   0x02000000   { Bit 25: Address: C8: user software int 1 } 
#define SFT2I   0x04000000   { Bit 26: Address: D0: user software int 2 } 
#define SFT3I   0x08000000   { Bit 27: Address: D8: user software int 3 } 
#define SFT4I   0x10000000   { Bit 28: Address: E0: user software int 4 } 
#define SFT5I   0x20000000   { Bit 29: Address: E8: user software int 5 } 
#define SFT6I   0x40000000   { Bit 30: Address: F0: user software int 6 } 
#define SFT7I   0x80000000   { Bit 31: Address: F8: user software int 7 } 


Listing 8.5 def21020.h 



8.3.3.1 Initialization Following Reset (Initial Setups) 

This section describes each of the initial setups. 

Disabling interrupts. A processor reset automatically clears the IMASK 
register, effectively blocking any interrupts from interfering with 
instruction execution. There is nothing the programmer is required to do 
for this step. 

Initializing the interrupt vector table. As in example #1, the beginning of 
program memory is used to store the interrupt vector table. Part of the 
initialized vector table can be found in Listing 8.4. 
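
The placement of the rst_svc and tmzh_svc segments can be cross-checked against the "Address" column of Listing 8.5. The following Python sketch is an illustration only: it assumes, based on that column, that each vector occupies eight program memory words and that the vector address is eight times the IRPTL bit position. The function name is hypothetical.

```python
# Assumption: 8 words per vector, inferred from the "Address" column of
# Listing 8.5 (bit 1 -> 0x08, bit 4 -> 0x20, bit 31 -> 0xF8).
WORDS_PER_VECTOR = 8


def vector_address(bit_position):
    """Return the program memory address of the vector for an IRPTL bit."""
    if not 0 <= bit_position <= 31:
        raise ValueError("IRPTL bit positions run from 0 to 31")
    return bit_position * WORDS_PER_VECTOR


if __name__ == "__main__":
    # Reset (bit 1) and high-priority timer (bit 4), as used in iirirq.ach.
    print(hex(vector_address(1)), hex(vector_address(4)))
```

Under this assumption the reset vector spans 0x08 to 0x0F and the high-priority timer vector spans 0x20 to 0x27, which agrees with the segment bounds in Listing 8.3.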

Altering the values in the memory hardware configuration registers. 
Memory wait states are configured by the PMWAIT and DMWAIT 
registers. Upon processor reset, these registers contain these default 
values: 

PMWAIT: 0x0000 03DE 
DMWAIT: 0x000F 7BDE 

In the example system in Figure 8.5, all RAM can operate with zero wait 
states in both program memory and data memory spaces. The two I/O 
channels mapped into data memory require wait states, however. The 
input channel sends a hardware acknowledge when it is ready to end its 
bus (read) cycle; thus, the input device controls the number of wait states. 
The output channel functions properly with five wait states. 


To set the wait state registers for this example system, execute the 
following instructions: 

pmwait=0x0021;     {RAM = no waits} 
dmwait=0xC401;     {RAM = no waits, 
                    in_channel = ext. hardware-generated ACK, 
                    out_channel = automatic 5 cycle waits} 


The on-chip memory bank select decoding simplifies the hardware 
memory interface shown in Figure 8.5. Notice that three memory devices 
are required on the data memory side: a RAM storage area, an input 
device, and an output device. The two I/O channels could be A/D or 
D/A converters or buffers to a host computer bus, for example. 

When the processor accesses a data memory location, one of four memory 
select lines is activated. Program memory space is likewise divided into 
two banks. Figure 8.4 shows the standard memory subdivision created by 
the initial values of the PMBANK and DMBANK registers after processor 
reset. In this example, the following lines of code modify this memory 
configuration to the one shown in Figure 8.6. 

pmbank1 = 0x000800;      {change PMS1- start address} 
dmbank1 = 0x00001000;    {change DMS1- start address} 
dmbank2 = 0x00002000;    {change DMS2- start address} 
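
As an illustration of the resulting data memory decoding, the Python sketch below maps an address to the number of the select line (DMS0- to DMS3-) that would be activated. It assumes that bank 0 starts at address 0, that each bank extends up to the start of the next one, and that DMBANK3 keeps the default value read from Figure 8.6; the function and constant names are hypothetical.

```python
from bisect import bisect_right

# Bank start addresses after the three assignments above.
# Assumptions: bank 0 starts at 0; the DMBANK3 value is the default
# shown in Figure 8.6 and is not changed by this example.
DM_BANK_STARTS = [0x00000000, 0x00001000, 0x00002000, 0xC0000000]


def dm_select_line(address, bank_starts=DM_BANK_STARTS):
    """Return the index n of the DMSn- line selected for a 32-bit address."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("data memory addresses are 32 bits wide")
    # The selected bank is the last one whose start address is <= address.
    return bisect_right(bank_starts, address) - 1


if __name__ == "__main__":
    for addr in (0x00000000, 0x00001000, 0x00002000):
        print(hex(addr), "-> DMS%d-" % dm_select_line(addr))
```

With these values the RAM falls in bank 0, in_channel (0x00001000) in bank 1, and out_channel (0x00002000) in bank 2, so each device can be enabled directly by its own select line.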

Initializing address and data registers. The same considerations 
presented in example #1 apply here. In this example, circular addressing 
is not used as reflected in the following code fragment: 


l0=0; l1=0; l8=0; 
m1=1; m8=1; 

The index (I) registers and the base (B) registers are set in this example 
during the two instructions which immediately follow delayed call 
instructions to the subroutines which use the DAG registers. 

Initializing memory locations and buffers. In the software development 
stage, RAM or ROM buffers in memory can be initialized by the assembler 
using directives such as: 

.VAR cosine [256] = "cos.dat"; 

.VAR list [4 ] = 18.37, 1.0, -300.28769, 0.0; 

In actual hardware, however, RAM sections cannot make use of these 
assembly-time initializations. The PROM Splitter creates files which 
PROM programmers use to initialize ROM memory. Emulation tools 
allow downloading initialized values to RAM memory. In the latter case, 
the initialization occurs only once, before processor reset. If the processor 
changes the initialized memory, subsequent reset operations do not 
re-initialize it, so the system restarts in a state different from the 
original one. 

A good programming practice is to write code which reliably reinitializes 
your RAM memory buffers during the initial setup phase after processor 
reset. In this example, a subroutine is called to zero out the biquad filter 
delay element storage locations. 


        call cascaded_biquad_init (db);   {zero the delay line} 
           r0=SECTIONS; 
           b0=dline; 


Here the B0 register (and, automatically, I0) is initialized as well. 

Configuring and initializing on-chip peripherals. The timer on the 
ADSP-21020 is utilized in this example for creating the sampling 
interrupts which control the filtering operation. In our example, the 
ADSP-21020 is clocked at a 20.0 MHz rate. The timer is configured for a 
sampling interval such that the sampling frequency is 100 kHz. The 
TPERIOD register is set accordingly as well as TCOUNT. 

tperiod=199;     {100 kHz intervals at 20.0 MHz CLKIN} 
tcount=199; 


The general formula relating the interrupt rate to the TPERIOD value is: 

Interrupt Rate = CLKIN frequency / (TPERIOD + 1) 

An interrupt rate of 9.6 kHz, for example, with a 20 MHz CLKIN 
frequency requires a TPERIOD value of 0x822 (2082 decimal). TCOUNT is 
the register which decrements during every processor cycle, and TPERIOD 
holds the value which is automatically reloaded into TCOUNT when the 
timer expires (decrements to zero and causes an interrupt). 
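
The formula can be rearranged to TPERIOD = CLKIN frequency / Interrupt Rate - 1. The short Python sketch below works through both cases quoted in the text; the function names are illustrative, and rounding to the nearest integer is an assumption for rates that do not divide CLKIN exactly.

```python
def tperiod_for_rate(clkin_hz, interrupt_rate_hz):
    """Return the TPERIOD value giving (approximately) the requested rate."""
    # Assumption: round to the nearest whole number of cycles per period.
    return round(clkin_hz / interrupt_rate_hz) - 1


def interrupt_rate(clkin_hz, tperiod):
    """Return the interrupt rate in Hz produced by a given TPERIOD."""
    return clkin_hz / (tperiod + 1)


if __name__ == "__main__":
    print(tperiod_for_rate(20.0e6, 100.0e3))     # 100 kHz sampling case
    print(hex(tperiod_for_rate(20.0e6, 9.6e3)))  # 9.6 kHz case
    print(interrupt_rate(20.0e6, 0x822))         # rate actually obtained
```

Because 20 MHz is not an exact multiple of 9.6 kHz, the rate obtained with 0x822 is close to, but not exactly, 9.6 kHz.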

Configuring and enabling interrupts. Good programming practice 
dictates that all setup operations should conclude before interrupts are 
allowed to affect program execution. The last two setup tasks are to 
configure the interrupting scheme and then enable interrupts to be 
recognized. In this example, the only interrupt being used is from the 
timer, which controls the sampling rate of the filter. 

Table 8.3 shows all five registers in the ADSP-21020 which affect interrupt 
configuration. Only some of the functions controlled by these registers are 
used in this example; the others are left in their default states. See the 
Interrupts section in Chapter 3 for complete information on these 
registers. 


Name      Function 

IMASK     which interrupts are to be recognized? 

IMASKP    what to do in the case of interrupt nesting? 
          (configured by processor automatically) 

IRPTL     which interrupts have occurred? 

MODE1     bit 12 turns interrupts on or off 
          bit 11 turns interrupt nesting on or off 

MODE2     bit 5 turns the interval timer on or off 
          bits 0-3 set IRQ0-3 edge- or level-sensitive 

Table 8.3 Interrupt-Related Registers 

These registers are set in this example as follows: 

bit set imask 0x10; 
bit set mode2 0x20; 
bit set mode1 0x1000;     {last initial setup} 

Using the standard definitions in the #include file called "def21020.h," 
shown in Listing 8.5, the bit positions specified by the values in these 
instructions translate to more readable bit names: 

#include "def21020.h" {place at top of file} 


bit set imask TMZHI; 
bit set mode2 TIMEN; 
bit set mode1 IRPTEN;     {last initial setup} 

The IMASK register is set so that the timer is allowed to interrupt the 
processor. This register is automatically cleared to zero during a processor 
reset. The timer has two different mask bits associated with it: 

TMZHI (bit position 4 or 0x00000010) 

TMZLI (bit position 14 or 0x00004000) 

The timer interrupts are described in Chapter 5. You may use either the 
higher-priority or the lower-priority interrupt. The higher-priority 
one was chosen in this example, but since no other interrupts are being 
used, either could have been selected. 
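
The relationship between a bit position and its mask value, and the effect of the bit set and bit clr instructions on a 32-bit register, can be modelled with the following Python sketch. It is only an illustration: the helper functions are hypothetical, and the constants repeat values from Listing 8.5.

```python
REG_MASK = 0xFFFFFFFF  # system registers are modelled as 32 bits wide


def bit_mask(position):
    """Return the mask with only the given bit position set."""
    return 1 << position


def bit_set(register, mask):
    """Model of 'bit set': OR the mask into the register value."""
    return (register | mask) & REG_MASK


def bit_clr(register, mask):
    """Model of 'bit clr': clear every register bit that is set in mask."""
    return register & ~mask & REG_MASK


TMZHI = bit_mask(4)    # timer interrupt, high priority
TMZLI = bit_mask(14)   # timer interrupt, low priority
TIMEN = bit_mask(5)    # timer enable (MODE2)
IRPTEN = bit_mask(12)  # global interrupt enable (MODE1)

if __name__ == "__main__":
    imask = bit_set(0, TMZHI)  # IMASK is zero after reset
    print(hex(imask), hex(TMZLI), hex(TIMEN), hex(IRPTEN))
```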

The timer is enabled by setting TIMEN to 1 in the MODE2 register. Once 
the timer is enabled, it automatically decrements the TCOUNT register 
once during every processor cycle. During this initial setup phase, the 
TCOUNT and the TPERIOD registers are typically set to the same value. 
This gives the processor some time to finish the last few setup instructions 
before going to the main loop and waiting there for interrupts. Keep in 
mind that as soon as you enable the timer, it begins to decrement on the 
next cycle. 







The IRPTL register is where interrupt requests are latched and cleared. 
This register is unaffected by a processor reset, and consequently it is the 
programmer's responsibility to clear this register before enabling 
interrupts. This can be done with either of the following equivalent 
instructions: 


irptl = 0; 

bit clr irptl 0xFFFFFFFF; 

It is good programming practice to execute this instruction just before 
executing the instruction which enables interrupts (IRPTEN bit in the 
MODE1 register). 

The MODE1 register has two bits (NESTM, IRPTEN) which affect 
interrupt operation. This register is automatically cleared to zero during a 
processor reset. The NESTM bit enables interrupt nesting. This example 
does not use nesting, so this bit is left unaltered after processor reset. The 
IRPTEN bit is the global interrupt enable bit. In order for the ADSP-21020 
to service any interrupts whatsoever, this bit must be set. It is the 
programmer's responsibility to set this bit to 1. It is good programming 
practice to do so only once all other initial setup operations are complete. 
The last instruction before the main processing loop section of code 
should be this: 


bit set mode1 0x1000; 

or 

#include "def21020.h" 


bit set mode1 IRPTEN; 

8.3.3.2 Main Processing Loop 

Having completed all the necessary initial setup operations, the program 
is ready to execute the main processing loop. The main processing loop 
typically consists of either: 

• a list of tasks which eventually terminates, possibly with interrupt 
intervention, or 

• an endless loop, waiting to be interrupted, in which most tasks are 
performed during interrupt service routines. 

In this example, the second method is implemented. The endless loop 
consists of nothing more than: 


wait: idle; 

jump wait; 


This keeps the ADSP-21020 in an idle state, waiting for an interrupt to tell 
it to process the next sample. It is good programming practice to be 
executing an IDLE instruction while waiting for interrupts (without doing 
anything else) because this technique lowers the power consumption of 
the processor in a system. The IDLE instruction is described in Chapter 9. 

The total power budget is calculated by summing the power dissipated in 
idle mode as well as the power dissipated in servicing interrupts. A 
shorter interrupt service routine means that a greater percentage of time is 
spent using less power. 
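
The power budget described above amounts to a time-weighted average of the two power levels. The Python sketch below illustrates the calculation. The power figures and the interrupt service routine length are hypothetical placeholders chosen only to show the arithmetic; the 200-cycle period follows from TPERIOD=199 in this example.

```python
def average_power(p_idle_w, p_active_w, isr_cycles, period_cycles):
    """Return the time-weighted average power over one interrupt period."""
    if not 0 <= isr_cycles <= period_cycles:
        raise ValueError("the service routine must fit within one period")
    duty = isr_cycles / period_cycles  # fraction of time spent servicing
    return duty * p_active_w + (1.0 - duty) * p_idle_w


if __name__ == "__main__":
    # Hypothetical values: 0.5 W idle, 2.0 W active, 40-cycle service routine.
    # period_cycles = TPERIOD + 1 = 200 for the timer setup in this example.
    print(average_power(0.5, 2.0, isr_cycles=40, period_cycles=200))
```

Halving isr_cycles in this model moves the average closer to the idle figure, which is the effect the paragraph above describes.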

8.3.3.3 Terminating The Main Processing Loop 

This main processing loop runs indefinitely, without termination. To stop 
execution during simulation, however, open the input port simulation file 
and select Autowrap=NO as an option. This causes the simulator to stop 
when the end-of-file (EOF) is reached in the input file. More details on this 
follow. 

8.3.4 Creating The Executable Program 

The executable program for this system example is created using these 
commands to invoke the ADSP-21000 Family Assembler and Linker: 


asm21k iirirq          NOTE: def21020.h must be in current directory 
asm21k cascade 

ld21k iirirq cascade -a iirirq -m 

8.3.5 Simulation 

This example system can be simulated using these commands to invoke 
the ADSP-21000 Family Simulator: 

sim21k -e iirirq -a iirirq NOTE: input.dat must be in current directory 

In this example, the input samples are read from an I/O port simulation 
file. The file called "input.dat" is chosen as the input data. The contents of 
this file (see Listing 8.6) represent a normalized unit impulse function. 
Notice that the file can contain comments. 
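
If a different number of samples is wanted, a file of this kind can be generated rather than typed. The Python sketch below is an illustration only: it assumes the simulator accepts one value per line, as Listing 8.6 suggests, and the function names are hypothetical.

```python
def unit_impulse_lines(samples=300):
    """Return the text lines of a normalized unit impulse of given length."""
    if samples < 1:
        raise ValueError("at least one sample is required")
    # One sample of 1.000 followed by zeros, one value per line.
    return ["1.000"] + ["0.000"] * (samples - 1)


def write_port_file(path, samples=300):
    """Write the impulse to a port simulation file such as input.dat."""
    with open(path, "w") as port_file:
        port_file.write("\n".join(unit_impulse_lines(samples)) + "\n")


if __name__ == "__main__":
    write_port_file("input.dat")
```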






Once inside the simulation session, open the ports and use the 
Autowrap=NO option. After simulation is complete, the output file 
generated by writing to the simulated output port should contain the 
filter's impulse response function (see Listing 8.7). 

1.000    This is the input data 
0.000    for the biquads 
0.000 
0.000 


0.000 
0.000    (300 samples total) 

Listing 8.6 Input Data Read by Input Port (Normalized Unit Impulse) 


1.000000000 

1.787410300 

1.332763443 

0.832507949 


0.042204879 

0.047202070 (300 samples total) 

Listing 8.7 Output Data Stored by Output Port (Impulse Response) 

8.4 CALLED SUBROUTINES (cascade.asm) 

The two routines in the file "cascade.asm" perform the cascaded biquad 
IIR filtering operations: 

cascaded_biquad_init (clears delay line storage elements) 

cascaded_biquad (passes a sample through filter) 

Each routine begins with a program memory label (cascaded_biquad_init or 
cascaded_biquad) and ends with an RTS instruction. These labels are global 
(declared with the .GLOBAL directive), which makes their names known to 
other files, such as the main program which calls these subroutines. 
The other labels (clear and quads) remain unknown outside cascade.asm 
because they are not declared global. It is not possible to refer to them by 
name from another file. 
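
As a rough reference model of what the two routines compute, the Python sketch below processes one sample through the cascade. It is not the assembly source. It assumes the coefficient order given in the Listing 8.4 comment (a12, a11, b12, b11 for each section), the delay line order (w'', w') given there, and a convention in which the feedback terms are added, as the additions in Figure 8.8 suggest. The Python function signatures are hypothetical.

```python
def cascaded_biquad_init(dline):
    """Zero the delay line in place (two storage elements per section)."""
    for k in range(len(dline)):
        dline[k] = 0.0


def cascaded_biquad(sample, coefs, dline):
    """Pass one sample through all sections, updating dline in place."""
    sections = len(dline) // 2
    for s in range(sections):
        a2, a1, b2, b1 = coefs[4 * s:4 * s + 4]   # assumed order a_s2,a_s1,b_s2,b_s1
        w2, w1 = dline[2 * s], dline[2 * s + 1]   # w'' and w'
        w = sample + a2 * w2 + a1 * w1            # feedback sum (terms added)
        sample = w + b2 * w2 + b1 * w1            # output feeds the next section
        dline[2 * s], dline[2 * s + 1] = w1, w    # shift the delay line
    return sample


if __name__ == "__main__":
    dline = [0.0, 0.0]
    coefs = [0.0, 0.5, 0.0, 0.0]  # hypothetical single section, a1 = 0.5
    print([cascaded_biquad(x, coefs, dline) for x in (1.0, 0.0, 0.0)])
```

A model like this can be used to spot-check the simulator's output file against the same coefficients and input data.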

The routines in cascade.asm demonstrate several important programming 
concepts, namely: 

• Writing looped code 

• "Rolling" loops for more efficient code 

• Multifunction instructions and associated register restrictions 

8.4.1 Writing Looped Code 

Looped code is easily written using the nestable DO UNTIL construct. The 
advantage of using the DO UNTIL construct is that the ADSP-21020 
automatically tests termination status and branches appropriately, 
without any programming or execution overhead. For counter-controlled 
loops, the ADSP-21020 even allows setting the loop counter register 
(LCNTR) in the same instruction cycle in which the DO UNTIL instruction 
is executed. The loop can also terminate on conditions other than LCE 
(loop counter expired), such as an arithmetic status flag. See Chapter 3 for 
more details and for a list of loop restrictions. 




The R0 register is set by the calling program to tell the cascaded_biquad 
routine how many biquad sections to compute. For example, a sixth-order 
structure (which consists of three cascaded biquads) is computed if R0=3. 
The DO UNTIL loop is set up by the instruction: 

        lcntr=r0, do quads until lce; 

The assembly source code within the loop is: 

        f12=f2*f4, f8=f8+f12, f3=dm(i0,m1), f4=pm(i8,m8); 
        f12=f3*f4, f8=f8+f12, dm(i1,m1)=f3, f4=pm(i8,m8); 
        f12=f2*f4, f8=f8+f12, f3=dm(i0,m1), f4=pm(i8,m8); 
quads:  f12=f3*f4, f8=f8+f12, dm(i1,m1)=f3, f4=pm(i8,m8); 


Here is a cycle-by-cycle trace of the loop execution with r0=3: 


 1   lcntr=r0, do quads until lce; 
 2   f12=f2*f4, f8=f8+f12, f3=dm(i0,m1), f4=pm(i8,m8); 
 3   f12=f3*f4, f8=f8+f12, dm(i1,m1)=f3, f4=pm(i8,m8); 
 4   f12=f2*f4, f8=f8+f12, f3=dm(i0,m1), f4=pm(i8,m8); 
 5   f12=f3*f4, f8=f8+f12, dm(i1,m1)=f3, f4=pm(i8,m8); 
 6   f12=f2*f4, f8=f8+f12, f3=dm(i0,m1), f4=pm(i8,m8); 
 7   f12=f3*f4, f8=f8+f12, dm(i1,m1)=f3, f4=pm(i8,m8); 
 8   f12=f2*f4, f8=f8+f12, f3=dm(i0,m1), f4=pm(i8,m8); 
 9   f12=f3*f4, f8=f8+f12, dm(i1,m1)=f3, f4=pm(i8,m8); 
10   f12=f2*f4, f8=f8+f12, f3=dm(i0,m1), f4=pm(i8,m8); 
11   f12=f3*f4, f8=f8+f12, dm(i1,m1)=f3, f4=pm(i8,m8); 
12   f12=f2*f4, f8=f8+f12, f3=dm(i0,m1), f4=pm(i8,m8); 
13   f12=f3*f4, f8=f8+f12, dm(i1,m1)=f3, f4=pm(i8,m8); 
14   <next instruction after loop code> 
15   <next instruction>, etc. 


The above code is extremely efficient. Many resources are operating 
concurrently during every instruction cycle. Guidelines for efficiency in 
looped code are described in the following section. 
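
The cycle counts in the trace follow a simple pattern: one cycle for the loop setup instruction plus four cycles per biquad section. The Python sketch below restates this as a formula. It counts only the instructions shown in the trace and assumes no memory wait states or other stalls; the function name is illustrative.

```python
INSTRUCTIONS_PER_SECTION = 4  # the four multifunction instructions in the loop


def loop_cycles(sections):
    """Return the cycles for the loop setup plus all loop iterations."""
    if sections < 1:
        raise ValueError("at least one biquad section is required")
    # 1 cycle for 'lcntr=r0, do quads until lce;' plus 4 per section.
    return 1 + INSTRUCTIONS_PER_SECTION * sections


if __name__ == "__main__":
    print(loop_cycles(3))  # the r0=3 case traced above
```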

8.4.2 Rolling Loops For More Efficient Code 

"Rolling" a loop means pipelining operations to minimize instructions 
within a loop, exploiting the ADSP-21020's parallel architecture to 
maximize concurrent operations. This involves scheduling operations and 
adding some extra lines of code before and after the loop to "fill" and 
"drain" the pipeline. The basic rolled structure is shown in Figure 8.7. 
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piped ) 





pipe(1)                                                 loop prologue
pipe(1)  pipe(2)                                        loop prologue
pipe(1)  pipe(2)  pipe(3)                               loop prologue
pipe(1)  pipe(2)  pipe(3) ...                           loop prologue
pipe(1)  pipe(2)  pipe(3) ...  pipe(n-1)                loop prologue
pipe(1)  pipe(2)  pipe(3) ...  pipe(n-1)  pipe(n)       loop body (iterate here)
         pipe(2)  pipe(3) ...  pipe(n-1)  pipe(n)       loop epilogue
                  pipe(3) ...  pipe(n-1)  pipe(n)       loop epilogue
                          ...  pipe(n-1)  pipe(n)       loop epilogue
                               pipe(n-1)  pipe(n)       loop epilogue
                                          pipe(n)       loop epilogue

Loop prologue   Instructions to fill the pipeline
Loop body       Instructions executed during looped steady state
Loop epilogue   Instructions to drain the pipeline

Figure 8.7 Filling and Draining the Pipeline
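
The fill, steady-state and drain phases of Figure 8.7 can be enumerated programmatically. The following Python sketch is an illustration under stated assumptions, not a tool from this manual: it assumes each pipe stage takes one cycle and that stage s of an iteration runs one cycle after stage s-1; the function name `rolled_schedule` is hypothetical.

```python
def rolled_schedule(stages, iterations):
    """List (phase, active) per cycle for a rolled loop.

    `active` holds (stage, iteration) pairs, both numbered from 1.
    Assumes one cycle per stage and at least as many iterations as stages.
    """
    if iterations < stages:
        raise ValueError("need at least as many iterations as pipe stages")
    schedule = []
    for t in range(iterations + stages - 1):
        active = [(s + 1, t - s + 1)
                  for s in range(stages)
                  if 0 <= t - s < iterations]
        if t < stages - 1:
            phase = "prologue"      # pipeline still filling
        elif t < iterations:
            phase = "body"          # every stage busy
        else:
            phase = "epilogue"      # pipeline draining
        schedule.append((phase, active))
    return schedule


if __name__ == "__main__":
    for phase, active in rolled_schedule(3, 5):
        print(f"{phase:9s}", " ".join(f"pipe({s})@{i}" for s, i in active))
```

With n stages, the prologue and the epilogue each take n-1 cycles, and only the body cycles keep every resource busy.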

Figure 8.8 shows the instructions to be executed for the three biquad 
sections in this example. The operations are listed in chronological order 
and are vertically arranged according to the computation unit or memory 
bus used. When these operations are consolidated into multifunction 
instructions to match the model shown in Figure 8.7, the code in Listing 
8.8 results. 


Cycle  Multiplier   ALU           DM access       PM access

  1                 f8=<input data>;
  2    f12=0;

       *** begin first section ***

  3                               f2=dm(i0,m1),   f4=pm(i8,m8);
  4    f12=f2*f4,                 f3=dm(i0,m1),   f4=pm(i8,m8);
  5    f12=f3*f4,   f8=f8+f12,    dm(i1,m1)=f3,   f4=pm(i8,m8);
  6    f12=f2*f4,   f8=f8+f12,                    f4=pm(i8,m8);
  7    f12=f3*f4,   f8=f8+f12,    dm(i1,m1)=f8;
  8                 f8=f8+f12;

       *** begin second section ***

  9                               f2=dm(i0,m1),   f4=pm(i8,m8);
 10    f12=f2*f4,                 f3=dm(i0,m1),   f4=pm(i8,m8);
 11    f12=f3*f4,   f8=f8+f12,    dm(i1,m1)=f3,   f4=pm(i8,m8);
 12    f12=f2*f4,   f8=f8+f12,                    f4=pm(i8,m8);
 13    f12=f3*f4,   f8=f8+f12,    dm(i1,m1)=f8;
 14                 f8=f8+f12;

       *** begin third section ***

 15                               f2=dm(i0,m1),   f4=pm(i8,m8);
 16    f12=f2*f4,                 f3=dm(i0,m1),   f4=pm(i8,m8);
 17    f12=f3*f4,   f8=f8+f12,    dm(i1,m1)=f3,   f4=pm(i8,m8);
 18    f12=f2*f4,   f8=f8+f12,                    f4=pm(i8,m8);
 19    f12=f3*f4,   f8=f8+f12,    dm(i1,m1)=f8;
 20                 f8=f8+f12;

 21    <output data>=f8;

Figure 8.8 Loop Code Before Rolling 

8.4.3 Multifunction Instructions And Register Restrictions

The 48-bit wide instruction word provides great single-cycle flexibility 
and parallelism in the ADSP-21020 architecture. For example, the ALU, 
multiplier, program sequencer and two separate address generators can 
all function simultaneously on a wide selection of input and output 
registers. There are, however, tradeoffs to be made when several blocks 
are simultaneously active. For example, multifunction instructions require 
indirect addressing using I and M registers because the instruction word 
is not wide enough to accommodate multiple instructions and direct, 
immediate address or modify amounts. 

The multiport register file (F0-F15 or R0-R15) can normally be read from 
and written to without restriction; however, in multifunction instructions, 
the ALU and multiplier inputs are restricted to particular sets of registers, 
while the outputs are unrestricted. The architecture dictates that when
ALU and multiply operations are concurrent, the multiplier X-input may
be either F0, F1, F2 or F3 while the multiplier Y-input is chosen from F4,
F5, F6 or F7. The ALU X-input may be F8, F9, F10 or F11 while the Y-input
is chosen from F12, F13, F14 or F15. In floating-point multiply/accumulates,
the destination of the ALU is typically the same register as
one of its input registers (i.e., the previous accumulated total).

In the quads loop in Listing 8.8, the register restrictions for multifunction 
instructions do not deter efficient computation. Note also that for ease of 
programming and legibility, the pipes of the multifunction instructions 
are vertically aligned. 
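
The register restrictions above are easy to check mechanically while hand-scheduling code. The Python sketch below is a hypothetical helper (the name `check_multifunction` and its interface are assumptions, not part of any Analog Devices tool); it encodes only the four input-register sets stated in this section.

```python
# Input-register sets for concurrent multiplier/ALU operations, as stated above.
MUL_X = {0, 1, 2, 3}        # F0-F3
MUL_Y = {4, 5, 6, 7}        # F4-F7
ALU_X = {8, 9, 10, 11}      # F8-F11
ALU_Y = {12, 13, 14, 15}    # F12-F15


def check_multifunction(mul_x, mul_y, alu_x, alu_y):
    """Return a list of violations for the given input register numbers."""
    problems = []
    if mul_x not in MUL_X:
        problems.append(f"multiplier X-input F{mul_x} not in F0-F3")
    if mul_y not in MUL_Y:
        problems.append(f"multiplier Y-input F{mul_y} not in F4-F7")
    if alu_x not in ALU_X:
        problems.append(f"ALU X-input F{alu_x} not in F8-F11")
    if alu_y not in ALU_Y:
        problems.append(f"ALU Y-input F{alu_y} not in F12-F15")
    return problems


if __name__ == "__main__":
    # f12=f2*f4, f8=f8+f12 from the quads loop: all four inputs are legal.
    print(check_multifunction(2, 4, 8, 12))
    # f12=f8*f4, f8=f8+f12 would put an ALU register on the multiplier X-input.
    print(check_multifunction(8, 4, 8, 12))
```

Both multiply/accumulate forms used in the quads loop (f12=f2*f4 and f12=f3*f4, each paired with f8=f8+f12) pass this check; outputs are not checked because they are unrestricted.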

cascaded_biquad:
        b1=b0;
        r8=r8 xor r8,                           f2=dm(i0,m1),  f4=pm(i8,m8);
        lcntr=r0, do quads until lce;
          f12=f2*f4,  f8=f8+f12,  f3=dm(i0,m1),  f4=pm(i8,m8);
          f12=f3*f4,  f8=f8+f12,  dm(i1,m1)=f3,  f4=pm(i8,m8);
          f12=f2*f4,  f8=f8+f12,  f2=dm(i0,m1),  f4=pm(i8,m8);
quads:    f12=f3*f4,  f8=f8+f12,  dm(i1,m1)=f8,  f4=pm(i8,m8);
        rts (db),     f8=f8+f12;
        nop;
        nop;

Listing 8.8 Final Rolled Loop Example in "cascade.asm"

8.5 DEVELOPING THE IIR FILTER AND COEFFICIENTS

The IIR filter can be developed using computer-aided filter design
software. In this example, FDAS (Filter Design and Analysis Software),
which is a product of Momentum Data Systems, was used. The file created
by FDAS is shown in Listing 8.9. This file lists the filter specifications
input by the user, as well as the filter coefficients computed by FDAS.

FILTER COEFFICIENT FILE
IIR DESIGN
FILTER TYPE                   BAND PASS
ANALOG FILTER TYPE            ELLIPTIC
PASSBAND RIPPLE IN -dB        -.1000
STOPBAND RIPPLE IN -dB        -1.0000
PASSBAND CUTOFF FREQUENCIES   .400000E+03  .500000E+03 HERTZ
STOPBAND CUTOFF FREQUENCIES   .300000E+03  .600000E+03 HERTZ
SAMPLING FREQUENCY            .800000E+04 HERTZ
FILTER DESIGN METHOD: BILINEAR TRANSFORMATION
FILTER ORDER                  6    0006h
NUMBER OF SECTIONS            3    0003h
NO. OF QUANTIZED BITS         32   0020h
QUANTIZATION TYPE - FLOATING POINT
COEFFICIENTS SCALED FOR FLOATING POINT IMPLEMENTATION

  .67730926E-02    /* overall gain */
  .00000000        /* section 1 coefficient B1 */
 -1.0000000        /* section 1 coefficient B2 */
  1.8039191        /* section 1 coefficient A1 */
 -.92128010        /* section 1 coefficient A2 */
 -1.7640328        /* section 2 coefficient B1 */
  1.0000000        /* section 2 coefficient B2 */
  1.8060702        /* section 2 coefficient A1 */
 -.96266572        /* section 2 coefficient A2 */
 -1.9376569        /* section 3 coefficient B1 */
  1.0000000        /* section 3 coefficient B2 */
  1.8791107        /* section 3 coefficient A1 */
 -.97108089        /* section 3 coefficient A2 */

Listing 8.9 FDAS File 

8.5.1 Normalized b0 Coefficient Biquad Filter Design Method

Many digital filter design techniques exist for determining filter
coefficients. To minimize coefficient quantization effects, IIR filters of high
order are usually implemented as cascaded biquad sections. Each biquad
section requires five filter coefficients, three feedforward and two
feedback. These sections can be in normal order or transposed order. Refer
to texts on digital filters for further information.

The dynamic range offered by floating-point numbers allows us to
normalize the coefficients in this example to the b0 coefficient. FDAS offers
b0 normalization in its menu of filter design techniques. The value of b0
becomes unity (1.0) and therefore any multiplication by this coefficient
does not have to be carried out. The biquad rewritten without b0
multiplication results in a four-coefficient, four-instruction-cycle per
biquad routine.
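
As a rough illustration of the four-coefficient biquad, the Python sketch below models a cascade with b0 normalized to 1. It is not the ADSP-21020 routine itself. Two things are assumptions inferred from this chapter rather than stated facts: that the feedback terms are added (the assembly loop uses only additions), and that each section stores its coefficients in the order A2, A1, B2, B1 with a two-word delay line holding w(n-2) then w(n-1). The overall gain factor is not applied. The name `cascaded_biquad` is reused here only for readability.

```python
def cascaded_biquad(sample, coefs, dline):
    """Filter one sample through cascaded biquads with b0 == 1.

    coefs: [a2, a1, b2, b1] per section (assumed order).
    dline: [w(n-2), w(n-1)] per section, updated in place (assumed layout).
    """
    acc = sample
    for s in range(len(coefs) // 4):
        a2, a1, b2, b1 = coefs[4 * s:4 * s + 4]
        w2, w1 = dline[2 * s], dline[2 * s + 1]
        w0 = acc + a2 * w2 + a1 * w1      # feedback half, no b0 multiply
        acc = w0 + b2 * w2 + b1 * w1      # feedforward half
        dline[2 * s], dline[2 * s + 1] = w1, w0
    return acc


if __name__ == "__main__":
    coefs = [0.0, 0.5, 0.0, 0.0]          # one section with a1 = 0.5 only
    dline = [0.0, 0.0]
    print([cascaded_biquad(x, coefs, dline) for x in (1.0, 0.0, 0.0)])
```

Each section costs four multiplies and four additions, which corresponds to the four multifunction instructions per biquad described above.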

8.5.2 DSP Code Generation 

Filtering routines are general-purpose; only the coefficients and the lengths
of the coefficient and delay-line buffers need to change for different filters.
Buffer lengths are modified by changing the #define preprocessor
directives, then reassembling and relinking the source code. To change the
filter coefficients, use any standard text editor to extract the
floating-point coefficient values from the file that FDAS generates, and
place them in another file that the source code references for coefficient
buffer initialization. An alternative is to delete all lines in the FDAS
file that are not coefficient values. In that case, make sure the filename
is the one that the ADSP-21020 source code references for coefficient
buffer initialization. For example, Listing 8.2 refers to "iircoefs.dat."

8.5.3 Coefficient Formatting 

The coefficients in the coefficient buffer are initialized in the ADSP-21020 
source code with the .VAR assembler directive. Note that this directive not 
only defines the buffer, but also initializes its contents with the values in 
the specified file, "iircoefs.dat." 

.var coefs[SECTIONS*4] = "iircoefs.dat";

The contents of "iircoefs.dat" are shown in Listing 8.10. This file was 
created by editing the FDAS file shown in Listing 8.9. 

-.92128010
1.8039191
-1.0000000
.00000000
-.96266572
1.8060702
1.0000000
-1.7640328
-.97108089
1.8791107
1.0000000
-1.9376569


Listing 8.10 “iircoefs.dat” File 



When saving the filter coefficients in FDAS, select the maximum allowable 
bits per coefficient (i.e., the least amount of quantization error). The 
floating-point coefficient values are stored in ASCII representation and 
read in the ADSP-21020 initialization in ASCII representation. For this 
reason, it is a good idea to use as many ASCII digits as possible. 
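
The effect of dropping ASCII digits can be seen by rounding decimal strings to 32-bit IEEE single precision. The Python sketch below is only an illustration: it models IEEE single precision with the standard `struct` module, not the assembler's own conversion or the 40-bit extended format, and the helper names are hypothetical.

```python
import struct


def to_float32(value):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def same_float32(text_a, text_b):
    """True if two decimal strings initialize a 32-bit float identically."""
    return to_float32(float(text_a)) == to_float32(float(text_b))


if __name__ == "__main__":
    full = "1.8039191"
    for digits in (3, 5, 9):
        truncated = full[:digits]
        print(truncated, same_float32(full, truncated))
```

A heavily truncated coefficient such as "1.8" initializes the buffer with a different value than "1.8039191", which is the quantization error the text warns about.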


8.6 PROGRAMMING HINTS 

This section describes good and poor programming practices. Strict 
adherence to these suggestions is not required but is strongly 
recommended. 

See Chapter 7 for a list of programming reminders and restrictions, as well 
as an overview of the instruction set. 

8.6.1 System Considerations For Scoping 

The scope of variables and code labels refers to where they are declared 
and which portions of software know of their existence. Variables and 
code labels can be global or local in scope. They must always be declared 
within a .SEGMENT definition. For example: 

Do This:

.SEGMENT/DM     dm_data;
.VAR            inbuf[SAMPLES];
.VAR            outbuf[SAMPLES];
.ENDSEG;

By default, a .VAR declaration forces this variable to remain local within
the file in which it is declared. To make a variable or code label global, it
must be declared global using the .GLOBAL directive in the same file in
which the variable or code label is originally declared. Any other file must
use the .EXTERN directive to make a global variable known to it.

The context in which a subroutine is used can suggest how to scope the
variables it references. For example, routines such as the IIR filtering
routines shown here may be used to implement many different filters. They
may even be included in a library of general-purpose routines. In such a
case, the variables the routine uses (coefs and dline) should be declared in
the main routine (iirmem.asm or iirirq.asm) and not in the called routine
(cascade.asm). This is because each filter has a unique set of coefficients
and delay line storage.


Notice that the example subroutines in "cascade.asm" do not refer to coefs 
or dline by name. For that reason, it is not necessary (nor desirable) to use 
the .GLOBAL or .EXTERN directives. The called subroutines only need to 
know the start addresses of the coefs and dline buffers. These addresses are 
passed to the subroutines by assignments to B or I registers in the calling 
program. The routine then uses the I registers for data addressing. This 
example shows those assignments being performed as the two 
instructions executed following a delayed branch call to the subroutines. 

Main Code:

.VAR coefs1[SECTIONS*4] = "iir1.dat";
.VAR coefs2[SECTIONS*4] = "iir2.dat";
.VAR coefs3[SECTIONS*4] = "iir3.dat";
.VAR dline1[SECTIONS*2];
.VAR dline2[SECTIONS*2];
.VAR dline3[SECTIONS*2];

        call readem (db);
         b0=dline1;                     {loads i0 register}
         b8=coefs1;                     {loads i8 register}

Subroutine:

        dm(i1,m1)=f3, f4=pm(i8,m8);     {i0, i8 used}
        . . .                           {no names referenced}
        rts;



This approach has two advantages. It is simple and quick to edit the main 
calling code only, allowing the subroutine to be general purpose. Also, 
less overhead is incurred when branching to the subroutine using a 
delayed branch. The two extra cycles following the delayed CALL 
instruction are conveniently employed to pass the coefficient and delay 
line buffer addresses to the subroutine. 

For example:

Do This:

        call cascaded_biquad (db);
         b0=dline1;
         b8=coefs1;
                                        {takes 3 cycles}

Not This:

        b0=dline1;
        b8=coefs1;
        call cascaded_biquad;
                                        {takes 5 cycles}

The former shows a delayed branch, the latter depicts a normal branch. 
The delayed branch saves two instruction cycles during execution. To use 
the faster delayed branch subroutine call, simply take two instructions 
from before the call and place them immediately after the delayed call. Do 
not use any of the restricted instructions such as branches or looping 
constructs as these two. See Chapter 3 for details on delayed branching. 
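
The 3-cycle and 5-cycle figures follow from simple counting. The Python sketch below is an illustrative model only; it assumes a branch always occupies one cycle plus two following slots, that the slots are wasted by a normal branch, and that a delayed branch fills them with instructions moved from before the call. The function name `call_cycles` is hypothetical.

```python
BRANCH_SLOTS = 2   # instruction slots that follow every branch


def call_cycles(setup_instructions, delayed):
    """Cycles spent on parameter setup plus the subroutine call."""
    if delayed:
        # Up to two setup instructions ride in the slots after the call.
        moved = min(setup_instructions, BRANCH_SLOTS)
        return (setup_instructions - moved) + 1 + BRANCH_SLOTS
    # Normal branch: setup first, then the call and two unused slots.
    return setup_instructions + 1 + BRANCH_SLOTS


if __name__ == "__main__":
    print("delayed:", call_cycles(2, delayed=True))
    print("normal: ", call_cycles(2, delayed=False))
```

With the two buffer-address assignments as setup, the model reproduces the counts in the example: 3 cycles for the delayed call and 5 for the normal call.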

8.6.2 Delayed Branches 

Understanding delayed branches requires a non-intuitive leap. 

Instructions following a branch instruction get executed before program 
flow continues at the branch destination. For this reason, a good 
programming practice is to highlight these two instructions. In the same 
way that loop nesting is emphasized by indentation, we use indentation to 
highlight the two instructions following a delayed branch instruction. This 
convention is not confused with loop indentation in this example, because 
this example uses only a single space indentation. When reading code, the 
space reminds you that these instructions execute before the branch is 
taken. 

The two instructions after a delayed branch can be used to pass 
parameters to the destination code branched to, especially when the 
delayed branch is a subroutine call. If the branch is a delayed return from 
a subroutine or interrupt, the two instructions associated with the delayed 
branch may actually finish up the subroutine or interrupt service tasks. 

Here is an example code fragment showing indentation used for nested
loops as well as for delayed branches. Note that loops and delayed
branches use different amounts of indentation to distinguish one from the
other.

#define DATASETS 4
#define SAMPLES  300

        lcntr=DATASETS, do sets until lce;
            lcntr=SAMPLES, do filtering until lce;
                f8=dm(i3,1);
                call biquad (db);
                 b0=dline;      {executes BEFORE biquad routine begins}
                 b8=coefs;      {executes BEFORE biquad routine begins}
filtering:      dm(i4,1)=f8;
sets:       nop;


8.6.3 Multifunction Instruction Coding 

Multifunction instructions which activate the ALU, the multiplier and the
two DAGs simultaneously can be quite lengthy. To graphically depict
operations in progress as well as resource utilization, it is good
programming practice to write sequential multifunction instructions in
such a way that the parts of the multifunction instructions that use the
same resource (e.g., ALU operations) line up vertically. If the instruction
does not use a resource, the space for that resource is left blank. The quads
loop in Listing 8.8 in the cascaded_biquad subroutine shows this vertical
alignment.

This graphically shows resource utilization and, more importantly for
hand code compaction, resource non-utilization. The programmer may
find a way later to fill those blank spaces (i.e., use the unused resource)
and reduce the total instruction count. This is described in an earlier
section, "Rolling Loops."


8.7 COMPLETE FFT EXAMPLE 

In addition to the two IIR examples presented in this chapter, an FFT
program is presented in this section to show a larger example. This
program features efficient memory usage in conjunction with fast
computational throughput.
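
The header comment of the listing notes that an inverse FFT can be obtained by swapping real and imaginary parts before and after the forward FFT. The Python sketch below checks that identity with a naive DFT. It is not a model of the radix-4 routine: the function names are hypothetical, and the division by N is an addition of this sketch, since the listing's comment does not mention scaling.

```python
import cmath


def dft(x):
    """Naive forward DFT, O(N^2); stands in for the forward FFT."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def swap_parts(x):
    """Exchange the real and imaginary part of every element."""
    return [complex(z.imag, z.real) for z in x]


def inverse_via_forward(spectrum):
    """Inverse transform using only the forward DFT and two swaps."""
    n = len(spectrum)
    return [z / n for z in swap_parts(dft(swap_parts(spectrum)))]


if __name__ == "__main__":
    signal = [1 + 2j, 3 - 1j, 0.5j, -2 + 0j]
    recovered = inverse_via_forward(dft(signal))
    print(max(abs(a - b) for a, b in zip(signal, recovered)))
```

In the assembly version, the swap amounts to exchanging which buffers hold the real and imaginary data, so no extra computation is needed.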


{ 

FFTRAD4.ASM ADSP-21020 Radix-4 Complex Fast Fourier Transform 

This routine performs a complex, radix 4 Fast Fourier Transform (FFT) . The FFT 
length (N) must be a power of 4 and a minimum of 64 points. The real part of 
the input data is placed in DM and the complex part in PM. This data is 
destroyed during the course of the computation. The real and complex output of 
the FFT is placed in separate locations in DM. 

Since this routine takes care of all necessary address digit-reversals, the 
input and output data are in normal order. The digit reversal is accomplished 
by using a modified radix 4 butterfly throughout which swaps the inner two 
nodes resulting with bit reversed data. The digit reversal is completed by 
bit reversing the real data in the final stage and then bit reversing the 
imaginary so that it ends up in DM. 

To implement an inverse FFT, you only have to (1) swap the incoming data, real 
and imaginary parts, (2) run the forward FFT, and (3) swap the outgoing data, 
real and imaginary parts . 

For this routine to work correctly, the program "twidrad4.C" must be used to 
generate the special twiddle factor tables for this program. 

Author: Karl Schwarz & Raimund Meyer, Universitaet Erlangen Nuernberg 

Revision: 27-MAR-91, Ronnin Yee, Analog Devices, DSP div., (617) 461-3672 

Calling Information:

        costwid table at DM : cosine    length 3*N/4
        sintwid table at PM : sine      length 3*N/4
        real input at DM    : redata    length N, normal order
        imag input at PM    : imdata    length N, normal order

Results:

        real output at DM   : refft     length N, normal order
        imag output at DM   : imfft     length N, normal order


(Note: Because the bit reversed addressing mode is used with the arrays
refft and imfft, they must start at addresses that are integer
multiples of the length (N) of the transform (i.e., 0, N, 2N, 3N, ...).
This is accomplished by specifying two segments starting at those addresses
in the architecture file and placing the variables alone in their
respective segments. These addresses must also be reflected in the
preprocessor variables ORE and OIM in bit reversed format.)

Altered Registers: 

All I, M, L and R registers. 

Three levels of looping. 

Benchmarks: radix-4, complex with digit reversal

        FFT Length    cycles    ms @ 20 MHz CLK    ms @ 25 MHz CLK
             64          920         .046               .037
            256         4044         .202               .162
           1024        19245         .962               .770
           4096        90702        4.535              3.628
          16384       419434       20.972             16.777

        First Stage  -  8 cycles per radix-4 butterfly
        Other Stages - 14 cycles per radix-4 butterfly


Memory Usage: 

pm code = 192 words, pm data = 1.75*N words, dm data = 3.75*N words 


Assembler Preprocessor Variables: 

N Number of points in FFT. Must be a power of four, minimum of 64. 

STAGES  Set to log4(N) or (log(N)/log(4))

OST     = bitrev(32 bit N/2)

ORE     = bitrev(32 bit addr of output real in dm), addr is 0, N, 2N, 3N, ...

OIM     = bitrev(32 bit addr of output imag. in dm), addr is 0, N, 2N, 3N, ...

} 


{ include for symbolic definition of system register bits }
#include "def21020.h"

{ The constants below must be changed for different length FFTs
        N      = number of points in the FFT
        STAGES = log4(N)
        OST    = bitrev(0x00000080=N/2), used as a modifier for bit reversal
        ORE    = bitrev(0x00000000=output real in dm)
        OIM    = bitrev(0x00004000=output imag in dm)
}

#define N       256
#define STAGES  4
#define OST     0x01000000
#define ORE     0x00000000
#define OIM     0x00020000



.SEGMENT/DM     dm_data;
.VAR    cosine[3*N/4]="tc.dat";     { Cosine twiddle factors, from TWIDRAD4 program }
.VAR    redata[N]="inreal.dat";     { Input real data }
.GLOBAL redata;
.ENDSEG;

.SEGMENT/DM     dm_rdat;            { this segment is an integer multiple of N }
.VAR    refft[N];                   { Output real data }
.GLOBAL refft;
.ENDSEG;

.SEGMENT/DM     dm_idat;            { this segment is an integer multiple of N }
.VAR    imfft[N];                   { Output imaginary data }
.GLOBAL imfft;
.ENDSEG;

.SEGMENT/PM     pm_data;
.VAR    sine[3*N/4]="ts.dat";       { Sine twiddle factors, from TWIDRAD4 program }
.VAR    imdata[N]="inimag.dat";     { Input imaginary data }
.GLOBAL imdata;
.ENDSEG;

.SEGMENT/PM     rst_svc;            { program starts at the reset vector }
        pmwait=0x0021;              { pgsz=0, pmwtstates=0, intrn.wtstates only }
        dmwait=0x008421;            { pgsz=0, dmwtstates=0, intrn.wtstates only }
        call fft;
stop:   idle;
        nop;
        nop;
.ENDSEG;

.SEGMENT/PM     pm_code;

fft:

{ first stage radix-4 butterfly without twiddles }

        i0=redata;
        i1=redata+N/4;
        i2=redata+N/2;
        i3=redata+3*N/4;
        i4=i0;
        i5=i1;
        i6=i2;
        i7=i3;
        m0=1;
        m8=1;

        i8=imdata;
        i9=imdata+N/4;
        i10=imdata+N/2;
        i11=imdata+3*N/4;
        i12=i8;
        i13=i9;
        i14=i10;
        i15=i11;

        l0=0;
        l1=l0;
        l2=l0;
        l3=l0;
        l4=l0;
        l5=l0;
        l6=l0;
        l7=l0;
        l8=l0;
        l9=l0;
        l10=l0;
        l11=l0;
        l12=l0;
        l13=l0;
        l14=l0;
        l15=l0;

                                f0=dm(i0,m0),   f1=pm(i8,m8);
                                f2=dm(i2,m0),   f3=pm(i10,m8);
        f0=f0+f2,   f2=f0-f2,   f4=dm(i1,m0),   f5=pm(i9,m8);
        f1=f1+f3,   f3=f1-f3,   f6=dm(i3,m0),   f7=pm(i11,m8);
        f4=f6+f4,   f6=f6-f4;
        f5=f5+f7,   f7=f5-f7;
        f8=f0+f4,   f9=f0-f4;
        f10=f1+f5,  f11=f1-f5;

        lcntr=N/4, do fstage until lce;     { do N/4 simple radix-4 butterflies }
          f12=f2+f7,  f13=f2-f7,  f0=dm(i0,m0),   f1=pm(i8,m8);
          f14=f3+f6,  f15=f3-f6,  f2=dm(i2,m0),   f3=pm(i10,m8);
          f0=f0+f2,   f2=f0-f2,   f4=dm(i1,m0),   f5=pm(i9,m8);
          f1=f1+f3,   f3=f1-f3,   f6=dm(i3,m0),   f7=pm(i11,m8);
          f4=f6+f4,   f6=f6-f4,   dm(i4,m0)=f8,   pm(i12,m8)=f10;
          f5=f5+f7,   f7=f5-f7,   dm(i5,m0)=f9,   pm(i13,m8)=f11;
          f8=f0+f4,   f9=f0-f4,   dm(i6,m0)=f12,  pm(i14,m8)=f14;
fstage:   f10=f1+f5,  f11=f1-f5,  dm(i7,m0)=f13,  pm(i15,m8)=f15;

{ Middle stages with radix-4 main butterfly }
                                        { m0=1 and m8=1 is still preset }
        m1=-2;                          { reverse step for twiddles }
        m9=m1;
        m2=3;                           { forward step for twiddles }
        m10=m2;
        m5=4;                           { first there are 4 groups }
        r2=N/16;                        { with N/16 butterflies in each group }
        r3=N/16*3;                      { step to next group }

        lcntr=STAGES-2, do mstage until lce;    { do STAGES-2 stages }
          i7=cosine;                    { first real twiddle }
          i15=sine;                     { first imag twiddle }
          r8=redata;
          r9=imdata;
          i0=r8;                        { upper real path }
          r10=r8+r2,    i8=r9;          { upper imaginary path }
          i1=r10;                       { second real input path }
          r10=r10+r2,   i4=r10;         { second real output path }
          i2=r10;                       { third real input path }
          r10=r10+r2,   i5=r10;         { third real output path }
          i3=r10;                       { fourth real input path }
          r10=r9+r2,    i6=r10;         { fourth real output path }
          i9=r10;                       { second imag input path }
          r10=r10+r2,   i12=r10;        { second imag output path }
          i10=r10;                      { third imag input path }
          r10=r10+r2,   i13=r10;        { third imag output path }
          i11=r10;                      { fourth imag input path }
          i14=r10;                      { fourth imag output path }
          m4=r3;
          m12=r3;
          r4=r3+1,      m6=r2;
          m3=r4;
          r2=r2-1,      m11=r4;
          m7=r2;




          lcntr=m5, do mgroup until lce;        { do m5 groups }
                                                  f0=dm(i7,m0),   f5=pm(i9,m8);
            f8=f0*f5,                             f4=dm(i1,m0),   f1=pm(i15,m8);
            f9=f0*f4;
            f12=f1*f5,                            f0=dm(i7,m0),   f5=pm(i11,m8);
            f13=f1*f4,  f12=f9+f12,               f4=dm(i3,m0),   f1=pm(i15,m8);
            f8=f0*f4,               f2=f8-f13;
            f13=f1*f5;
            f9=f0*f5,   f8=f8+f13,                f0=dm(i7,m1),   f5=pm(i10,m8);
            f13=f1*f4,  f12=f8+f12, f14=f8-f12,   f4=dm(i2,m0),   f1=pm(i15,m9);
            f11=f0*f4;
            f13=f1*f5,              f6=f9-f13;
            f9=f0*f5,   f13=f11+f13,              f11=dm(i0,0);
            f13=f1*f4,  f8=f11+f13, f10=f11-f13;



{ Do m1 radix-4 butterflies }

            lcntr=m7, do mr4bfly until lce;
                                      f13=f9-f13,   f4=dm(i1,m0),   f5=pm(i9,m8);
                          f2=f2+f6,   f15=f2-f6,    f0=dm(i7,m0),   f1=pm(i15,m8);
              f8=f0*f4,   f3=f8+f12,  f7=f8-f12,                    f9=pm(i8,0);
              f12=f1*f5,  f9=f9+f13,  f11=f9-f13,   f13=f2;
              f8=f0*f5,   f12=f8+f12,               f0=dm(i7,m0),   f5=pm(i11,m8);
              f13=f1*f4,  f9=f9+f13,  f6=f9-f13,    f4=dm(i3,m0),   f1=pm(i15,m8);
              f8=f0*f4,               f2=f8-f13,    dm(i0,m0)=f3,   pm(i8,m8)=f9;
              f13=f1*f5,  f11=f11+f14, f7=f11-f14,  dm(i4,m0)=f7,   pm(i12,m8)=f6;
              f9=f0*f5,   f8=f8+f13,                f0=dm(i7,m1),   f5=pm(i10,m8);
              f13=f1*f4,  f12=f8+f12, f14=f8-f12,   f4=dm(i2,m0),   f1=pm(i15,m9);
              f11=f0*f4,  f3=f10+f15, f8=f10-f15,                   pm(i13,m8)=f11;
              f13=f1*f5,              f6=f9-f13,    dm(i6,m0)=f8,   pm(i14,m8)=f7;
              f9=f0*f5,   f13=f11+f13,              f11=dm(i0,0);
mr4bfly:      f13=f1*f4,  f8=f11+f13, f10=f11-f13,  dm(i5,m0)=f3;



{ End radix-4 butterfly }
{ dummy for address update * * }
                        f13=f9-f13,   f0=dm(i7,m2),   f1=pm(i15,m10);
            f2=f2+f6,   f15=f2-f6,    f0=dm(i1,m4),   f1=pm(i9,m12);
            f3=f8+f12,  f7=f8-f12,                    f9=pm(i8,0);
            f9=f9+f13,  f11=f9-f13,   f0=dm(i2,m4);
            f9=f9+f2,   f6=f9-f2,     f0=dm(i3,m4),   f1=pm(i10,m12);
                                      dm(i0,m3)=f3,   pm(i8,m11)=f9;
            f11=f11+f14, f7=f11-f14,  dm(i4,m3)=f7,   pm(i12,m11)=f6;
            f3=f10+f15, f8=f10-f15,                   pm(i13,m11)=f11;
                                      dm(i6,m3)=f8,   pm(i14,m11)=f7;
mgroup:                               dm(i5,m3)=f3,   f1=pm(i11,m12);

          r3=m4;
          r1=m5;
          r2=m6;
          r3=ashift r3 by -2;           { groupstep/4 }
          r1=ashift r1 by 2;            { groups*4 }
          m5=r1;
mstage:   r2=ashift r2 by -2;           { butterflies/4 }


{ Last radix-4 stage }
{ Includes bitreversal of the real data in dm }

        bit set mode1 BR0;              { bitreversal in i0 }
                                        { with: m0=m8=1 preset }
        i4=redata;                      { input }
        i1=redata+1;
        i2=redata+2;
        i3=redata+3;
        i0=ORE;         { real output array base must be an integer multiple of N }
        m2=OST;
        i7=cosine;
        i8=imdata;                      { input }
        i9=imdata+1;
        i10=imdata+2;
        i11=imdata+3;
        i12=imdata;                     { output }
        i15=sine;
        m1=4;
        m9=m1;


                                                f0=dm(i7,m0),   f5=pm(i9,m9);
          f8=f0*f5,                             f4=dm(i1,m1),   f1=pm(i15,m8);
          f9=f0*f4;
          f12=f1*f5,                            f0=dm(i7,m0),   f5=pm(i11,m9);
          f13=f1*f4,  f12=f9+f12,               f4=dm(i3,m1),   f1=pm(i15,m8);
          f8=f0*f4,               f2=f8-f13;
          f13=f1*f5;
          f9=f0*f5,   f8=f8+f13,                f0=dm(i7,m0),   f5=pm(i10,m9);
          f13=f1*f4,  f12=f8+f12, f14=f8-f12,   f4=dm(i2,m1),   f1=pm(i15,m8);
          f11=f0*f4;
          f13=f1*f5,              f6=f9-f13;
          f9=f0*f5,   f13=f11+f13,              f11=dm(i4,m1);
          f13=f1*f4,  f8=f11+f13, f10=f11-f13;

{ Do N/4-1 radix-4 butterflies }

        lcntr=N/4-1, do lstage until lce;
                                  f13=f9-f13,   f4=dm(i1,m1),   f5=pm(i9,m9);
                      f2=f2+f6,   f15=f2-f6,    f0=dm(i7,m0),   f1=pm(i15,m8);
          f8=f0*f4,   f3=f8+f12,  f7=f8-f12,                    f9=pm(i8,m9);
          f12=f1*f5,  f9=f9+f13,  f11=f9-f13,   f13=f2;
          f8=f0*f5,   f12=f8+f12,               f0=dm(i7,m0),   f5=pm(i11,m9);
          f13=f1*f4,  f9=f9+f13,  f6=f9-f13,    f4=dm(i3,m1),   f1=pm(i15,m8);
          f8=f0*f4,               f2=f8-f13,    dm(i0,m2)=f3,   pm(i12,m8)=f9;
          f13=f1*f5,  f11=f11+f14, f7=f11-f14,  dm(i0,m2)=f7,   pm(i12,m8)=f6;
          f9=f0*f5,   f8=f8+f13,                f0=dm(i7,m0),   f5=pm(i10,m9);
          f13=f1*f4,  f12=f8+f12, f14=f8-f12,   f4=dm(i2,m1),   f1=pm(i15,m8);
          f11=f0*f4,  f3=f10+f15, f8=f10-f15,                   pm(i12,m8)=f11;
          f13=f1*f5,              f6=f9-f13,    dm(i0,m2)=f3,   pm(i12,m8)=f7;
          f9=f0*f5,   f13=f11+f13,              f11=dm(i4,m1);
lstage:   f13=f1*f4,  f8=f11+f13, f10=f11-f13,  dm(i0,m2)=f8;




                      f13=f9-f13;
          f2=f2+f6,   f15=f2-f6;
          f3=f8+f12,  f7=f8-f12,                    f9=pm(i8,m9);
          f9=f9+f13,  f11=f9-f13,   dm(i0,m2)=f3;
          f9=f9+f2,   f6=f9-f2,     dm(i0,m2)=f7;
                                                    pm(i12,m8)=f9;
          f11=f11+f14, f7=f11-f14,                  pm(i12,m8)=f6;
          f3=f10+f15, f8=f10-f15,                   pm(i12,m8)=f11;
                                    dm(i0,m2)=f3,   pm(i12,m8)=f7;
                                    dm(i0,m2)=f8;

{ Do the bitreversal of the imaginary part from pm to dm }

        i8=imdata;
        i0=OIM;         { imag output array base must be an integer multiple of N }
        f0=pm(i8,m8);

        lcntr=N-1, do pmbr until lce;   { do N-1 bitreversals }
pmbr:     dm(i0,m2)=f0, f0=pm(i8,m8);

        rts (db);
        dm(i0,m2)=f0;
        bit clr mode1 BR0;              { no bitreversal in i0 any more }

.ENDSEG;
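
The preprocessor constants in the listing (STAGES, OST, ORE and OIM) must be recomputed for each FFT length and output address. The Python sketch below is a hypothetical helper, not part of the Analog Devices tools; it applies the 32-bit bit reversal that the listing's comments describe and enforces the stated power-of-four and alignment rules. The names `bitrev32` and `fft_constants` are assumptions.

```python
def bitrev32(value):
    """Reverse the bit order of a 32-bit unsigned value."""
    result = 0
    for _ in range(32):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def fft_constants(n, real_out_addr, imag_out_addr):
    """Return the #define values for an N-point FFT and its output bases."""
    exponent = n.bit_length() - 1
    if n < 64 or n & (n - 1) or exponent % 2:
        raise ValueError("N must be a power of four, minimum 64")
    for addr in (real_out_addr, imag_out_addr):
        if addr % n:
            raise ValueError("output arrays must start at a multiple of N")
    return {
        "N": n,
        "STAGES": exponent // 2,            # log4(N)
        "OST": bitrev32(n // 2),
        "ORE": bitrev32(real_out_addr),
        "OIM": bitrev32(imag_out_addr),
    }


if __name__ == "__main__":
    for name, value in fft_constants(256, 0x0000, 0x4000).items():
        print(f"#define {name:7s}", value if name in ("N", "STAGES") else f"0x{value:08X}")
```

For N=256 with outputs at 0x0000 and 0x4000, this reproduces the values defined in the listing: STAGES 4, OST 0x01000000, ORE 0x00000000 and OIM 0x00020000.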



Hardware System Design 


9.1 OVERVIEW 

This chapter describes considerations for designing hardware systems
based on the ADSP-21020/21010 processor. It also supplies examples of
some common configurations.

9.1.1 Basic System Configuration

Figure 9.1, on the following page, shows a basic configuration for a system
based on the ADSP-21020. The following two signals coordinate the
operation of the ADSP-21020 and other devices in the system:

• The CLKIN signal comes from a clock oscillator that provides clocking
for the ADSP-21020 and other devices operating synchronously with it.

• The RESET signal is provided by a circuit that resets all or part of the 
system. 

These signals are described in greater detail in later sections of this 
chapter. 

The basic configuration in Figure 9.1 features program memory, data 
memory and peripherals that are mapped into data memory space. The 
connections in each memory interface are: 

• Address buses (PMA23-0 and DMA31-0) 

• Data buses (PMD47-0 and DMD39-0) 

• Bank selects (PMS1-0 and DMS3-0) 

• Read signals (PMRD and DMRD) 

• Write signals (PMWR and DMWR) 

Example memory configurations for both single and multiple processors 
are shown later in this chapter. 

Figure 9.1 Basic ADSP-21020 System Configuration

9.1.2 More Complex Configurations

Other signals shown in Figure 9.1 but not connected in this configuration 
could be used in a system with more complex memory interface 
requirements: 

• Bus acknowledges (PMACK or DMACK), for hardware-controlled 
wait states. 

• Three-state enables (PMTS or DMTS), to hold the processor off the 
memory bus during an external cache update (for example). 

• Page fault indicators (PMPAGE or DMPAGE), for interfacing to page- 
mode and static-column dynamic RAMs (DRAMs). 

• Bus request (BR) and bus grant (BG), for granting the memory buses
and control signals to another master.









There are pins through which the ADSP-21020 sends and receives control 
signals to and from other devices in the system. These pins may or may 
not be used, depending on the system: 

• Hardware interrupts (IRQ3-0) can come from devices that require the 
ADSP-21020 to perform some task on demand. One of the memory- 
mapped peripherals, for example, can use an interrupt to alert the 
processor that it has data available. Interrupts are described in detail in 
Chapter 3. 

• The flags (FLAG3-0), each of which can be programmed to be an input 
or an output, allow signalling between the ADSP-21020 and another 
device. For example, the ADSP-21020 can raise an output flag to 
interrupt some other device. Flags are described in detail later in this 
chapter. 

• The TIMEXP output is controlled by the on-chip timer. It indicates to 
other devices that the programmed time period has expired. The timer 
is described in detail in Chapter 5. 

• The test access port (TCK, TMS, TDI, TDO and TRST) can be connected 
to a controller that performs a boundary scan for test purposes or for 
powerup boot loading of external program memory. This port is also 
used by the ADSP-21020 EZ-ICE® Emulator to access on-chip 
emulation features. Use of this emulator requires a connector for access 
to the test access port. The connector is described in this chapter, in 
section 9.9. The test access port is described in detail in Appendix C. 

9.2 CLOCKS & SYNCHRONIZATION 

The ADSP-21020 receives its clock input on the CLKIN pin. The processor 
uses an on-chip phase-locked loop to generate its internal clock. 

Because the phase-locked loop requires some time to achieve phase lock, 
CLKIN must be valid for a minimum time period during reset before the 
RESET signal can be deasserted; this time period is specified in the 
ADSP-21020 Data Sheet. 

9.2.1 Synchronization Delay 

The ADSP-21020 has several asynchronous inputs, namely, RESET, TRST, 
BR, IRQ3-0 and FLAG3-0 (when configured as inputs). These inputs can 
be asserted in arbitrary phase to the processor clock, CLKIN. The ADSP- 
21020 synchronizes them prior to recognizing them. The delay associated 
with recognition is called the synchronization delay. 


Any asynchronous input must be valid prior to the recognition point to be 
recognized in a particular cycle. If an input does not meet the setup time 
on a given cycle, it may be recognized in the current cycle or during the 
next cycle. 

Therefore, to ensure recognition of an asynchronous input, it must be 
asserted for at least one full processor cycle plus setup and hold time 
(except for RESET, which must be asserted for at least four processor 
cycles). The minimum time prior to recognition (the setup and hold time) 
is specified in the ADSP-21020 Data Sheet. 
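
The minimum-width rule above is simple arithmetic, sketched here in Python. The function name is illustrative, and the cycle, setup and hold figures are placeholders rather than data sheet values.

```python
def min_assert_time_ns(cycle_ns, setup_ns, hold_ns, is_reset=False):
    """Minimum time an asynchronous input must stay asserted to be recognized.

    RESET needs at least four processor cycles; every other asynchronous
    input needs one full processor cycle plus setup and hold time.
    """
    if is_reset:
        return 4 * cycle_ns
    return cycle_ns + setup_ns + hold_ns


# Placeholder numbers (hypothetical, NOT taken from the data sheet).
print(min_assert_time_ns(30.0, 3.0, 2.0))
print(min_assert_time_ns(30.0, 3.0, 2.0, is_reset=True))
```

Substitute the setup and hold times from the ADSP-21020 Data Sheet for the placeholder arguments when sizing an interrupt or flag pulse.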

9.3 RESET

RESET halts execution and returns all registers to a state defined in Table
9.1. On powerup, RESET must be asserted (low). After the clock is stable 
for a minimum period (specified in the ADSP-21020 Data Sheet), RESET 
can be deasserted. 

Table 9.1 shows the states of the processor registers after reset. If a value is 
unchanged, it is uninitialized at powerup. Table 9.2 shows the states of 
outputs during reset (i.e. while RESET is low). 


Pin Name        Type            State During Reset

PMA23-0         Output          Driven, Value Undefined
PMD47-0         Bidirectional   High Impedance
PMS0, PMS1      Output          One High, the other Low
PMRD            Output          High
PMWR            Output          High
PMPAGE          Output          Low
DMA31-0         Output          Driven, Value Undefined
DMD39-0         Bidirectional   High Impedance
DMS0            Output          High
DMS1            Output          High
DMS2            Output          High
DMS3            Output          High
DMRD            Output          High
DMWR            Output          High
DMPAGE          Output          Low
FLAG0           Bidirectional   High Impedance
FLAG1           Bidirectional   High Impedance
FLAG2           Bidirectional   High Impedance
FLAG3           Bidirectional   High Impedance
BG              Output          Depends on BR
TIMEXP          Output          Low
TDO             Output          Depends on TRST and TCK

Register        Value after Reset

PC              0x0008
PCSTK           unchanged
PCSTKP          0x0000 (cleared)
FADDR           0x0008
DADDR           unchanged
LADDR           unchanged
CURLCNTR        unchanged
LCNTR           0x0000 (cleared)
R15 - R0        unchanged
I15 - I0        unchanged
M15 - M0        unchanged
L15 - L0        unchanged
B15 - B0        unchanged

MODE1           0x0000 (cleared)
MODE2           0xn000 0000 (bits 28-31 are the device identification field,
                identifying the silicon revision #)
IRPTL           0x0000 (cleared)
IMASK           0x0003
IMASKP          0x0000 (cleared)
ASTAT           0x00nn 0000 (bits 19-22 are equal to the values of the
                FLAG0-3 input pins; the flag pins are configured as
                inputs after reset)
STKY            0x0540 0000
USTAT1          0x0000 (cleared)
USTAT2          0x0000 (cleared)

DMWAIT          0x000F 7BDE
DMBANK1         0x2000 0000
DMBANK2         0x4000 0000
DMBANK3         0x8000 0000
DMADR           unchanged
PMWAIT          0x0003DE
PMBANK1         0x800000
PMADR           unchanged
PX              unchanged
PX1             unchanged
PX2             unchanged
TPERIOD         unchanged
TCOUNT          unchanged

Table 9.1 ADSP-21020 Register Values After Reset



The timing of the program memory interface for the first instruction fetch 
after a reset is shown in Figure 9.2 below. The first address output is the 
reset vector, 0x000008. PMPAGE is asserted because this is the first access 
to this page of memory. PMS0 is asserted and PMS1 deasserted because
0x000008 lies in bank 0 in the default configuration of memory banks. 
During the first two memory accesses, which have seven wait states each 
(due to the default value of the PMWAIT register), the first instruction is 
fetched and decoded. It is executed when the fetch of the third instruction 
begins. 


Figure 9.2 Program Memory Interface Timing at Reset

9.4 RCOMP PIN

The ADSP-21020's RCOMP pin is a compensation resistor input that
controls the processor's output driver/buffers.

To reduce system noise at low temperatures, when transistors switch
fastest, the ADSP-21020 employs compensated output drivers. These
drivers equalize slew rate over temperature extremes and process
variations. A 1.8 kΩ resistor placed between the RCOMP pin and
EVDD (+5 V) provides a reference for the compensated drivers. Use of a
capacitor, approximately 100 pF, placed in parallel with the 1.8 kΩ
resistor, is recommended.





9.5 FLAGS 

Four external pins on the ADSP-21020 (FLAG0, FLAG1, FLAG2 and
FLAG3) allow single-bit signalling between processors. Many
instructions can be conditioned on a flag's value, enabling efficient 
communication and synchronization between multiple processors or in 
other interfaces. Examples of flag use are included in the multiprocessing 
memory examples later in this chapter. 

9.5.1 Flag Direction 

The flags are bidirectional pins, each with the same functionality. Whether
each flag is an input or an output is controlled by bits in the
MODE2 register. The control for each flag is independent. On reset, the 
MODE2 register is cleared, so all the flags are inputs. 


MODE2

Bit     Name      Definition

15      FLG0O     FLAG0 1=output; 0=input
16      FLG1O     FLAG1 1=output; 0=input
17      FLG2O     FLAG2 1=output; 0=input
18      FLG3O     FLAG3 1=output; 0=input
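
As an illustration of the direction bits listed above, the following Python sketch decodes and sets them on a MODE2 value held as an integer. The helper names are hypothetical; only the bit positions 15-18 come from the table.

```python
# Bit positions of the flag direction controls in MODE2 (bits 15-18).
FLAG_DIR_BIT = {0: 15, 1: 16, 2: 17, 3: 18}


def flag_directions(mode2):
    """Return {flag number: 'output' or 'input'} for a MODE2 register value."""
    return {
        flag: "output" if (mode2 >> bit) & 1 else "input"
        for flag, bit in FLAG_DIR_BIT.items()
    }


def set_flag_output(mode2, flag):
    """Return the MODE2 value with the given flag configured as an output."""
    return mode2 | (1 << FLAG_DIR_BIT[flag])


# With the direction bits cleared (as after reset), every flag is an input.
print(flag_directions(0))
# Setting bit 17 turns FLAG2 into an output and leaves the others as inputs.
print(flag_directions(set_flag_output(0, 2)))
```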

9.5.2 Flag Input



When a flag pin is programmed as an input, its value is stored in a bit in 
the ASTAT register. These flag bits are not changed when the ASTAT
register is pushed onto or popped off the status stack. The bit is updated 
each cycle with the input value from the pin. Flag inputs can be 
asynchronous to the ADSP-21020 clock, so there is a one-cycle delay 
before a change on the pin appears in the ASTAT bit if the rising edge of 
the flag input misses the setup requirement for that cycle. The states of 
ASTAT flag bits are conditions that you can specify in conditional 
instructions. 


ASTAT

Bit     Name      Definition

19      FLG0      FLAG0 value
20      FLG1      FLAG1 value
21      FLG2      FLAG2 value
22      FLG3      FLAG3 value


An ASTAT flag bit is read-only if the flag is configured as an input. 
Otherwise, the bit is readable and writeable. 




9.5.3 Flag Output 

When a flag is an output pin, the value on the flag pin follows the data bit 
in the ASTAT register. These flag bits are not changed when the ASTAT
register is pushed onto or popped off the status stack. A program can set
or clear the ASTAT flag bit to provide a signal to another processor or
peripheral. The timing of a flag output is shown in Figure 9.3. 

Figure 9.3 Flag Output Timing

9.6 MEMORY CONFIGURATIONS 

The ADSP-21020 provides memory management and interface features to 
support a variety of memory configurations. These features are described 
in detail in Chapter 6. This section presents examples of several 
configurations: systems based on a single ADSP-21020 processor and 
systems based on multiple ADSP-21020s. These examples may serve as a 
starting point for your hardware design. 

9.6.1 Single Processor Configurations 

The memory configurations in this section are based on a single ADSP- 
21020 processor. The examples range from simple to complex. These 
examples show general concepts only; the details of a specific design 
would depend on the application. 

9.6.1.1 One Memory Bank

Figure 9.4 shows the ADSP-21020 connected to a single bank of data 
memory. Each memory device is a 15 ns static RAM (SRAM), which 
allows the ADSP-21020 access with zero wait states. (Refer to the 
ADSP-21020 Data Sheet for specific timing parameters.) The memory 
devices are each 4 bits wide by 64K locations. The 64K locations require 
only the lower 16 address bits of the ADSP-21020; the upper bits are 
unused. 

In this example, eight memory devices are used to provide 32-bit width 
for IEEE standard floating-point data. (If extended 40-bit data were 
needed, 10 devices would be used.) The lower eight bits of the data bus 
(D7-0) are not connected. No pullup or pulldown resistors are needed on 
these unused pins — this is taken care of on-chip. The ADSP-21020 drives 
and reads these lower eight bits even when it is internally configured for 
32-bit data (the computation units ignore the lower eight bits). 

The DMWR output of the ADSP-21020 controls each memory device's 
write enable (WE), and the DMRD output controls each memory device's 
output enable (OE). The DMS0 signal, which in this case is the only
memory select ever activated, is connected to the chip enable (CE) of each 
device. This particular memory device has a second chip enable, which is 
not used (tied low) in this example. 


[Figure: ADSP-21020 (30 ns) signals DMD39-8, DMA15-0, DMWR, DMRD and DMS0
connected to 64K x 4, 15 ns SRAMs (pins A15-0, WE, OE, CE1, CE2)]

Figure 9.4 Interface to Single Data Memory Bank


9.6.1.2 Several Memory Banks

Three banks of data memory are connected to the ADSP-21020 in the
example in Figure 9.5. As in the previous example, each memory device is
a 15 ns SRAM, for zero wait states. The memory devices are each 8 bits
wide by 32K locations, for a total of 96K locations. Five devices in each
bank are needed for 40-bit floating-point data.

Bank 0 extends from 0x0000 to 0x7FFF; bank 1 from 0x8000 to 0xFFFF; and
bank 2 from 0x10000 to 0x17FFF. However, only the lower 15 address bits
of the ADSP-21020 are needed for addressing the 32K locations in each
bank because the data memory selects enable only one bank at a time.
DMS0, DMS1, and DMS2 are connected to the chip enables (CE) of banks
0, 1 and 2, respectively. The DMS3 memory select, not used in this
example, could be used to select a fourth bank.

The DMWR output of the ADSP-21020 controls each memory device's 
write enable (WE), and the DMRD output controls each memory device's 
output enable (OE). 
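
The address split in this example can be sketched in Python as follows. This models only the three 32K banks described here; the function name is illustrative, and on the real device the bank boundaries are set by the bank registers described in Chapter 6.

```python
BANK_SIZE = 0x8000  # 32K locations per bank in this example


def decode(address):
    """Split an address into (bank number, 15-bit address within the bank)."""
    bank = address // BANK_SIZE
    if bank > 2:
        raise ValueError("address is outside banks 0-2 of this example")
    return bank, address & 0x7FFF


for addr in (0x0000, 0x7FFF, 0x8000, 0x17FFF):
    bank, offset = decode(addr)
    # The bank number picks the memory select; only A14-0 reach the SRAMs.
    print(f"0x{addr:05X} -> DMS{bank}, A14-0 = 0x{offset:04X}")
```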



Figure 9.5 Interface to Three Data Memory Banks 








9.6.1.3 Memory & I/O Devices

The example in Figure 9.6 is identical to the previous one but shows 
memory-mapped I/O devices added. These devices, one input-only and 
one output-only, can be mapped to any location in bank 3 (any location 
greater than 0x17FFF in this example), since there are no other devices in
that bank. The DMS3 signal selects the I/O devices for access; the read 
and write strobes differentiate between the two, enabling the latch of the 
output port on writes and the buffer of the input port on reads. 

I/O devices should be connected to the 32-bit integer field (the upper 32 
bits) of the DMD or PMD data buses — bits 39-8 of the DMD bus, and bits 
47-16 of the PMD bus. 
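
The bus alignment described above amounts to a fixed shift. The following Python sketch places a 32-bit value on the stated bus bits and recovers it again; the function names are illustrative only.

```python
MASK32 = 0xFFFFFFFF
DMD_SHIFT = 8    # the 32-bit field occupies DMD39-8
PMD_SHIFT = 16   # the 32-bit field occupies PMD47-16


def to_dmd(value32):
    """Place a 32-bit value on bits 39-8 of the 40-bit DMD bus."""
    return (value32 & MASK32) << DMD_SHIFT


def from_dmd(bus40):
    """Recover the 32-bit value carried on DMD39-8."""
    return (bus40 >> DMD_SHIFT) & MASK32


def to_pmd(value32):
    """Place a 32-bit value on bits 47-16 of the 48-bit PMD bus."""
    return (value32 & MASK32) << PMD_SHIFT


print(hex(to_dmd(0x12345678)))
print(hex(to_pmd(0x12345678)))
```

An 8-bit peripheral wired this way would therefore sit on DMD15-8, the lowest byte of the 32-bit field, rather than on DMD7-0.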



Figure 9.6 Interface to Three Data Memory Banks and Two I/O Devices 

9.6.1.4 Hardware Acknowledge

Figure 9.7 shows the interface to a relatively slow I/O device that uses the 
hardware acknowledge (DMACK, in this case) to insert wait states in the 
memory cycle. This device is mapped to data memory bank 3. No other 
devices are in the same bank, so the DMS3 signal can be used to gate the 
strobe that enables the I/O device to read or write data (N bits). 



Figure 9.7 I/O Device Interface with Hardware Acknowledge 

The latch on the RDY output of the I/O device synchronizes the signal to 
the ADSP-21020. This latch is not needed if the RDY signal meets the 
setup requirement for DMACK. 

If the I/O device needs extra time to deassert RDY after an access is 
initiated (so that DMACK will not be erroneously sampled high), the 
ADSP-21020 can be programmed to require both internal wait states and a 
hardware acknowledge to terminate the memory cycle. The programmed 
wait states give the I/O device extra cycles in which to deassert RDY. But 
the RDY signal will still determine the end of the memory cycle. 
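
A rough cycle-level model in Python shows why the programmed wait states help. It is a simplification under stated assumptions: the function name and the list-of-samples representation are invented for illustration, and the exact sampling points are defined in Chapter 6 and the data sheet.

```python
def access_length(wait_states, ack_by_cycle):
    """Return the number of cycles a memory access takes.

    ack_by_cycle[i] is the acknowledge level sampled in cycle i (True = high).
    The access ends in the first cycle that is past the programmed wait
    states AND has the acknowledge high.
    """
    for cycle, ack in enumerate(ack_by_cycle):
        if cycle >= wait_states and ack:
            return cycle + 1
    raise RuntimeError("acknowledge never asserted")


# Hypothetical device: RDY is still high in cycle 0 (left over from the last
# access), drops for three cycles, then rises when the data is ready.
rdy = [True, False, False, False, True]

print(access_length(0, rdy))  # stale RDY ends the access too early
print(access_length(1, rdy))  # one wait state masks the stale sample
```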




9.6.1.5 Cache Memory

Figure 9.8 shows how an external cache controller would use the three- 
state enable (DMTS, for data memory). The cache controller monitors the 
ADSP-21020 address to detect a cache miss. When a miss occurs, the 
controller asserts DMTS in time to prevent the ADSP-21020 from 
completing the access. The ADSP-21020 places the data memory interface 
in a high-impedance state and halts. This allows the cache controller to 
retrieve the needed data from main memory and load it into the cache. It 
then deasserts DMTS and deasserts DMACK for one cycle to allow the 
ADSP-21020 to complete the memory access. 

The 20 kΩ pullup resistors on DMRD, DMWR and DMS3 are needed to
hold these controls inactive (high) during the transition of control between 
the ADSP-21020 and the cache controller. 



Figure 9.8 Cache Memory Interface 



If the cache memory requires the ADSP-21020 to use wait states, the 
processor can be programmed to recognize the AND of internal and 
external acknowledges as the terminator of the memory cycle. After the 
controller updates the cache, it holds DMACK low for the required 
number of wait states plus one, the extra cycles allowing for the 
completion of the access that caused the cache miss. The minimum timing 
for DMACK is shown in Figure 9.9. 

Figure 9.9 Timing on Cache Miss


9.6.1.6 DRAM With Paging

The example in Figure 9.10 shows the ADSP-21020 interface to a page-mode
or static-column DRAM using a DRAM controller (which may be
implemented with PALs and PGAs). This example is similar to the cache
memory example in that it uses DMTS to hold off the memory access
while an external memory controller takes over.

The DRAM controller's outputs are normally tristated. The DMPAGE
output signals a change of page to the controller. This can be accompanied
by an automatic wait state if the controller requires extra time. The DRAM
controller responds by latching in the address and by asserting DMTS to
prevent the ADSP-21020 from completing the access. It then controls the
DRAM to effect the page change, using the latched address, by driving the 
10 MSBs of the address onto the 10 LSBs of the DRAM address input. 
When finished, the DRAM controller tristates its memory controls and 
deasserts DMTS and deasserts DMACK in the same cycle for the 
appropriate number of wait state cycles plus one. This timing is the same 
as for the cache controller, shown in Figure 9.9. 
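
The address multiplexing performed by the controller can be sketched in Python. This is a sketch under a loud assumption: the source states only that the 10 MSBs are driven onto the 10 DRAM address inputs, so the 20-bit address with 10 row bits and 10 column bits used here is hypothetical.

```python
ROW_BITS = 10
COL_BITS = 10  # ASSUMPTION: 10 row + 10 column bits (20-bit address)


def split(address):
    """Return (row, column) for a latched address."""
    row = (address >> COL_BITS) & ((1 << ROW_BITS) - 1)
    col = address & ((1 << COL_BITS) - 1)
    return row, col


def page_changed(previous, current):
    """True when two accesses fall in different DRAM rows (pages)."""
    return split(previous)[0] != split(current)[0]


print(split(0x003FF))                  # last column of row 0
print(split(0x00400))                  # first column of row 1
print(page_changed(0x003FF, 0x00400))  # crossing a row boundary
```

Only accesses for which `page_changed` is true would need the controller to take over; accesses within the same row proceed at page-mode speed.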



Figure 9.10 Page-Mode DRAM Interface 

9.6.1.7 Direct Memory Access (DMA)

The example shown in Figure 9.11 uses the bus request/bus grant 
protocol of the ADSP-21020 to perform direct memory access (DMA) on 
the ADSP-21020's data memory. The DMA controller is driven by a host
processor (not shown) that occasionally needs to access the data memory
to read or write a buffer of data. When the host requests an access, the
DMA controller asserts bus request (BR) on the ADSP-21020. The
ADSP-21020 completes its current instruction, places its memory buses in a
high-impedance state, and asserts bus grant (BG). The ADSP-21020 idles while
BR is asserted.

The bus grant allows the DMA controller to access the data memory, 
using counters to generate the necessary addresses and clocking the data 
to or from the host through the bidirectional latch. The DMA controller 
must also provide the appropriate memory strobes. 



Figure 9.11 DMA Controller Interface Using Bus Request 





When the DMA access is complete, the controller deasserts BR and the 
ADSP-21020 continues program execution from where it left off. Timing 
for the bus request/bus grant cycle is shown in Figure 9.12. Note that 
there is at least one cycle of delay after BR is asserted and before BG goes 
low (more if the ADSP-21020 is executing an instruction requiring extra 
cycles). After BG goes low, there may be one cycle of overhead during 
which no instructions are executed and no data is transferred. There may 
be another cycle of overhead when exiting bus grant if the DMA controller 
cannot tristate its outputs before the ADSP-21020 drives the bus. In this 
case, the controller must tristate its outputs in the previous cycle. 

Figure 9.12 Bus Request Timing for DMA



9.6.2 Multiprocessor Configurations 

This section describes several memory configurations featuring multiple 
ADSP-21020s operating in the same system. In these examples, the 
processors pass data between one another through shared memory. The 
configuration appropriate for a specific application depends on the data 
flow and timing required by the system. 

9.6.2.1 Multiport Memory

Figure 9.13 shows the minimal hardware for connecting three ADSP- 
21020s and a host processor to a 4-port RAM. The particular memory 
device in this example is 8 bits wide by 2K locations. Four devices are 
needed for 32-bit data in Figure 9.13; the number of devices actually used 
depends on the data width to be supported. 

The memory provides four identical interfaces consisting of address, data, 
write strobe, output enable, and chip enable. In this example, each 
processor maps accesses to this memory to bank 1 of data memory 
(enabled by DMS1). Only 11 of each processor's 32 address lines are
needed to address the 2K locations. Also shown for each processor is local 
data memory that may use all of the address lines. The local memory 
would occupy a different bank of memory and thus be enabled by a 
different memory select signal. 
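
The device and address-line counts quoted in these examples follow from two small calculations, sketched here in Python with illustrative function names.

```python
def address_lines(locations):
    """Number of address lines needed to address the given number of locations."""
    return (locations - 1).bit_length()


def devices_needed(data_width_bits, device_width_bits):
    """Number of memory devices needed side by side for the data width."""
    return -(-data_width_bits // device_width_bits)  # ceiling division


print(address_lines(2 * 1024))   # 2K-location four-port RAM
print(devices_needed(32, 8))     # 8-bit devices for 32-bit data
print(devices_needed(40, 8))     # 8-bit devices for 40-bit data
```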





Figure 9.13 Four-Port RAM Configuration 

9.6.2.2 Serial Data Flow

If the flow of data in a multiprocessor system is serial, that is, data moves 
in sequence along a linear path from one processor to the next, then 
several configurations are suggested. 

Buffer Latches 

Figure 9.14 shows a simple, low-cost solution for synchronous transfers.
The linkage between every two processors in a serial path consists of a set
of buffer latches, the number of which is determined by the width of the
data to be transferred (8, 16, 24, 32 or 40 bits). Each processor in the serial
path outputs data to latches on its DMD bus, and reads data from
latches on its PMD bus. Each processor also has local program memory
and data memory. Reads from and writes to the latches are distinguished
by the bank 1 selects (DMS1 and PMS1) in this case. Data is clocked into a
latch by the rising edge of the DMWR signal; the output of a latch is
enabled by the PMRD signal.

Figure 9.14 Serial Data Flow Using Buffers

Two of each processor's flags (FLAG0 and FLAG1, programmed as
inputs) are used for synchronizing the data transfer between processors.
When one processor writes to the latch, it asserts the external semaphore,
which in turn asserts the FLAG1 input of the next processor to indicate
that there is data in the latch. When the read occurs, the second processor
resets the signal on its FLAG1 input and also sets the FLAG0 input of the
first processor to indicate that the data has been read. The first processor
then writes the latch again, resetting the signal on its FLAG0 input.

A processor reads the latch only if FLAG1 indicates that there is data to 
read. The instruction would be a conditional instruction that reads 
program memory: 

    IF FLAG1_IN F3=PM(buffer);    {"buffer" is a location in bank 1}

Likewise, a processor would write the latch only if FLAG0 indicates that
the previous data has been read. 

In this configuration, a processor can read the latch in the cycle after it was 
written. The maximum throughput is therefore one data transfer every 
two cycles. High transfer rates, however, require the operations of all 
processors in the system to be closely synchronized, since there is no 
external storage in which to accumulate data. 
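
The handshake and its two-cycle throughput limit can be illustrated with a small Python simulation. This is a behavioral sketch, not a hardware model: the class and method names are invented, and the two flag signals are collapsed into a single "full" state (data available to the receiver when set, latch free for the sender when clear).

```python
class LatchLink:
    """One-word buffer latch between two processors (illustrative model)."""

    def __init__(self):
        self.data = None
        self.full = False  # True while unread data sits in the latch

    def try_write(self, value):
        """Sender: write only when the previous word has been read."""
        if self.full:
            return False
        self.data = value
        self.full = True
        return True

    def try_read(self):
        """Receiver: return the latched word, or None if nothing is waiting."""
        if not self.full:
            return None
        self.full = False
        return self.data


link = LatchLink()
pending = [10, 20, 30]
received = []
cycles = 0
while len(received) < 3:
    if cycles % 2 == 0:
        # Sender's turn: write the next word if the latch is free.
        if pending and link.try_write(pending[0]):
            pending.pop(0)
    else:
        # Receiver's turn: read in the cycle after the write.
        word = link.try_read()
        if word is not None:
            received.append(word)
    cycles += 1

print(received, cycles)  # [10, 20, 30] 6
```

Three words take six cycles, matching the limit of one transfer every two cycles; if either side misses its turn, the other simply stalls because the latch holds only one word.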

The use of both memory interfaces is optional. Alternatively, both input 
and output latches could be placed on the data memory interface. In this 
case, the program memory interface would be used only for program 
memory accesses. 

FIFOs

Figure 9.15 shows an example similar to the previous one, except that the 
buffer latches are replaced with FIFOs. This configuration has the 
advantage that each processor does not have to wait for the next one to 
read data before it can write data. The processors can operate at full speed, 
and bottlenecks are avoided. 



Figure 9.15 Serial Data Flow Using FIFOs 

In this configuration, a low FLAG0 input indicates that the FIFO is full
(FF flag is asserted). Thus, writes to the FIFO should be conditioned on
FLAG0:

    IF FLAG0_IN DM(fifo)=F7;        {"fifo" is a location in bank 1}

or

    DO loop UNTIL NOT FLAG0_IN;     {loop is at least 5 instructions, so flag }
                                    { can go low before loop restarts.        }
          compute, DM(fifo)=F7;     {"fifo" is a location in bank 1}
          instruction 2;
          instruction 3;
          instruction 4;
    loop: instruction 5;            {FLAG0 is low here on last loop}


Similarly, when the FIFO is empty it asserts its EF flag which deasserts the 
FLAG1 input of the processor receiving data. FIFO reads should be 
conditioned on FLAG1. 

As in the previous example, the use of the program memory interface is 
optional; both the input FIFO and output FIFO can be connected to a 
processor's data memory interface. 

Dual-Port Memory

Figure 9.16 shows an example using dual-port RAM to transfer data 
between processors. This configuration is similar to the previous one, but 
has even more data storage and allows bidirectional data flow on the 
serial path. 

The INT pin is a general-purpose output that is set when location 0x3FE or 
0x3FF of the dual-port RAM is written. These locations are mailbox 
registers that can be used to pass messages. In this example, the INT 
output is connected to a flag input on the ADSP-21020. The processor can 
read the mailbox register if the flag is set. Alternatively, INT could trigger 
an interrupt, with the service routine reading the mailbox. 



Figure 9.16 Serial Data Flow Using Dual-Port RAM 

The dual-port RAM has a BUSY output that it asserts if contention occurs
(both processors try to write the same location at the same time). If both
processors are operating with zero wait states, however, this function
cannot be used. Contention can be avoided if each processor is constrained in
software to read only those locations that the other processor writes and to 
write only those locations that the other processor reads. In this way, two 
writes to the same location will never occur. 


9.7 PROGRAM MEMORY BOOT AT RESET 

After reset, the ADSP-21020 automatically fetches its first instruction from 
location 0x08 of program memory. If program memory consists of ROM, 
the instructions are available in memory at powerup. If program memory 
is made up of RAM, however, there must be a mechanism for loading the 
program into memory at powerup. A single RAM and no ROM is 
frequently an attractive option because the addition of 8-bit ROMs would 
require six more memory devices, resulting in a higher cost, more board 
space and higher capacitive loads on the address lines than with RAM 
alone. 

In a boot operation the ADSP-21020 executes a minimal program that 
loads the rest of program memory. A way to implement the boot 
operation is to load the instructions of a boot program through the
ADSP-21020's test access port. This port, which conforms to the IEEE 1149.1
specification, is described in Appendix C. It allows serial data to be shifted 
into and out of the ADSP-21020. The internal serial path connects to every 
input and output pin, so that the value of any pin can be read or written to 
using the serial shift mechanism. 

The boot operation can be controlled by a host processor or a dedicated 
microcontroller. The operation would proceed generally as follows: 

1. The host or controller shifts an instruction and address into the 
program memory data inputs with PMWR deasserted. 

2. The host or controller shifts the same instruction and address into the 
program memory data inputs with PMWR asserted. 

3. The host or controller shifts the same instruction and address into the 
program memory data inputs with PMWR deasserted. 

4. Steps 1-3 are repeated for a series of instructions and sequential
addresses that cause the processor to load a loader program into 
program memory. 

5. When the loader program has been loaded, RESET is deasserted, and 
the processor begins executing the loader program from location 0x08 
in memory to bring in the main program. 

An example loader program is shown below in Figure 9.17. In this 
example, data is read in byte-wise on DMD15-8. The 48-bit instructions
are reconstructed in the shifter, then transferred to the PMD bus and to 
program memory using the PX registers. This routine assumes that data is 
coming from an 8-bit EPROM. If a host is supplying the data, the address 
lines can be ignored. This routine requires 18 instructions. 


         I8=START_ADR;                              {load program at START_ADR}
         M8=1;                                      {M8 increments by 1}
         L8=0;                                      {no circular buffer}
         I1=EPROM_ADR;                              {pointer to EPROM}
         M1=1;                                      {M1 increments by 1}
         L1=0;                                      {no circular buffer}

         LCNTR=LOAD_COUNT, DO LOAD_LP UNTIL LCE;
           R0=DM(I1,M1);                            {load byte 1 LSB}
           R1=FDEP R0 BY 0:8, R0=DM(I1,M1);         {load byte 2}
           R1=R1 OR FDEP R0 BY 8:8, R0=DM(I1,M1);   {load byte 3}
           PX1=R1;
           R1=FDEP R0 BY 0:8, R0=DM(I1,M1);         {load byte 4}
           R1=R1 OR FDEP R0 BY 8:8, R0=DM(I1,M1);   {load byte 5}
           R1=R1 OR FDEP R0 BY 16:8, R0=DM(I1,M1);  {load byte 6 MSB}
           R1=R1 OR FDEP R0 BY 24:8;
           PX2=R1;
LOAD_LP:   PM(I8,M8)=PX;                            {write instr. to PM}
         JUMP START_ADR;                            {begin execution}

Figure 9.17 Example Loader Program
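
The byte assembly performed by the loader routine can be checked on a host with a short Python model. This is only an arithmetic sketch of the listing: bytes 1-2 form the PX1 value and bytes 3-6 the PX2 value, and it assumes that PX2 supplies the upper 32 bits of the 48-bit instruction. The function names are illustrative.

```python
def assemble_instruction(six_bytes):
    """Rebuild one 48-bit instruction from six bytes, least significant first."""
    if len(six_bytes) != 6:
        raise ValueError("an instruction is built from exactly six bytes")
    b = list(six_bytes)
    px1 = b[0] | (b[1] << 8)                                # bytes 1-2
    px2 = b[2] | (b[3] << 8) | (b[4] << 16) | (b[5] << 24)  # bytes 3-6
    return (px2 << 16) | px1  # ASSUMPTION: PX2 is the upper 32 bits of PX


def assemble_program(eprom):
    """Convert an EPROM image (a multiple of 6 bytes) into 48-bit words."""
    return [assemble_instruction(eprom[i:i + 6]) for i in range(0, len(eprom), 6)]


image = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06])
print([f"0x{word:012X}" for word in assemble_program(image)])
```

The same function can be run in reverse when preparing an EPROM image: each 48-bit instruction is split into six bytes, least significant byte first.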

9.8 MEMORY INTERFACE CAPACITIVE LOAD

The timing parameters for the memory interfaces of the ADSP-21020 are 
specified at capacitive loads of 100 pF on the address, memory select, page 
boundary, read strobe and write strobe outputs. For the data and other 
outputs, the nominal load is 50 pF. If the capacitive load on an output is 
different than the nominal load, its switching characteristic is affected. 
Specifically, as the capacitance on a pin increases, so do its rise and fall 
times. If an output delay is measured at the point that the output reaches a 
particular voltage level, then the delay increases for larger capacitive loads 
and decreases for smaller ones. Consequently, pin-to-pin variations in 
capacitive loading can alter the relative timing of outputs to the point 
where they no longer meet the input requirements of the memory device. 

The drive strength of the ADSP-21020 memory outputs is sufficient for 
large capacitive loads, so most variations in loading will not change 
relative timing enough to violate any memory device specifications. This 
section describes how to determine whether load variations in your 
system will cause timing problems and how to correct them. 

9.8.1 Load Variations 

A typical application that can contain large variations in loading between 
pins is one that requires multiple banks of memory. Figure 9.18 (on page 
9-28) shows an example with three banks of external data memory, one of 
which is 32K words long and the other two which are 8K words each. 
Address lines DMA12-0, DMRD and DMWR are loaded by all three banks 
of memory, a total of 15 RAMs. The address lines DMA13 and DMA14 
and memory selects DMS2-0 are connected to only one of the three 
memory banks and have approximately a third the load of the other 
outputs. 

In this scenario, there are a number of RAM specifications that can be 
negatively affected by the load variation. One of these is the address hold 
from write end (write strobe deasserted), typically specified at 0 ns 
minimum. The timing of the ADSP-21020 data memory address and 
DMWR outputs is specified to guarantee a positive hold time at nominal 
capacitive loads. However, if the load on an address line is much less than 
that on the write strobe, which is the case for DMA13 and DMA14, the 
address line switches faster relative to the strobe. If the result is that the 
address line switches before the strobe does, then the address hold time is 
negative, and the RAM input requirement is not met. This situation is 
depicted in Figure 9.19. 

There are other capacitive loading variations that can lead to violations of
RAM specifications. If data lines are more heavily loaded than the write 
strobe, for example, the data may not be valid in time to meet the setup 
requirement before the write end. Heavily loaded address lines can be 
slowed enough to adversely affect the address-to-read-data-valid and 
address-to-acknowledge requirements. 

To determine whether capacitive loading variations in a particular 
situation lead to specification violations, refer to graphs in the 
ADSP-21020 Data Sheet that specify for each type of output how the delay 
changes with capacitive load. Adjust the nominal delay for each output 
based on its capacitive load, then assess whether the relative timing of 
signals meets the input specifications of the RAM. 



Figure 9.18 Memory Configuration with Unequal Capacitive Loads 

Figure 9.19 Effect of Unequal Capacitive Loads

9.8.2 Correcting The Timing

Timing violations caused by load variations can be corrected by slowing 
down the faster outputs. Two ways to accomplish this are adding discrete 
capacitance or adding series resistance to the faster outputs. Each method 
results in a longer delay, but discrete capacitance may add ringing 
whereas series resistance may increase susceptibility to noise coupling. If 
series resistance is chosen, the following formula yields the approximate 
amount of resistance required: 

R = Δt / C_load

where Δt is the additional delay required and C_load is the capacitive
load on the output.

Thus, to increase the delay by 2 ns on an output that has a 40 pF load, add
about 50 Ω of resistance in series between the output and the RAM input,
near the output pin. Access times may suffer if too much delay is added, 
however. 
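
The worked figure above can be reproduced with a one-line calculation, shown here in Python. The function name is illustrative, and the formula is the approximation given in this section, not a substitute for the data sheet's delay-versus-load graphs.

```python
def series_resistance_ohms(added_delay_s, load_farads):
    """Approximate series resistance needed to add a given output delay."""
    return added_delay_s / load_farads


r = series_resistance_ohms(2e-9, 40e-12)  # 2 ns of delay into a 40 pF load
print(round(r, 1))  # 50.0
```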

9.9 EZ-ICE EMULATOR CONSIDERATIONS

The ADSP-21020 EZ-ICE® Emulator is a development tool for 
debugging programs running in real time on your ADSP-21020 or 
ADSP-21010 target system hardware. 

The EZ-ICE provides a controlled environment for observing, 
debugging, and testing activities in a target system by connecting 
directly to the target processor through its JTAG interface. The 
emulator can monitor system behavior while running at full speed. It 
lets you examine and alter memory locations as well as processor 
registers and stacks. 

Because EZ-ICE controls the target system's ADSP-21020 (or ADSP- 
21010) through the processor's IEEE 1149.1 (JTAG) Test Access Port, 
non-intrusive in-circuit emulation is assured; the emulator does not 
impact target loading or timing. The emulator's in-circuit probe 
connects to an IBM PC host computer with an ISA bus plug-in board. 

Target system boards must have an 11-pin JTAG connector to accept
the EZ-ICE's in-circuit probe (a 12-pin plug). 

9.9.1 Target Board Connector For EZ-ICE Probe 

The EZ-ICE uses the IEEE 1149.1 JTAG test access port of the ADSP-
21020/21010 to monitor and control the target board processor during
emulation. The EZ-ICE probe requires that CLKIN, TMS, TCK, TRST, 
TDI, TDO, and GND be made accessible on the target system via a 12- 
pin connector (pin strip header) such as that shown in Figure 9.20. The 
EZ-ICE probe plugs directly onto this connector for chip-on-board 
emulation; you must add this connector to your target board design if 
you intend to use the EZ-ICE. 

The 12-pin, 2-row pin strip header is keyed at the pin 1 location — you 
must clip pin 1 off of the header. The pins must be 0.025 inch square 
and at least 0.318 inch in length. Pin spacing is 0.1 x 0.1 inches. Pin 
strip headers are available from vendors such as 3M, Samtec, and 
McKenzie. 

The tip of the pins must be at least 0.18 inch higher than the tallest 
component on your board to allow clearance for the bottom of the 
emulator's probe.

The length of the traces between the EZ-ICE probe connector and the
ADSP-21020/21010's test access port pins should be less than 1 inch. 


The BTMS, BTCK, BTRST, and BTDI signals are provided so that the
test access port can also be used for board-level testing. When the
connector is not being used for emulation, place jumpers between the
BXXX pins and the XXX pins as shown in Figure 9.20. If you are not
going to use the test access port for board test, tie BTRST to GND and
tie or pull up BTCK to VDD. The TRST pin must be asserted (pulsed
low) after powerup (through BTRST on the connector) or held low for
proper operation of the ADSP-21020/21010.

9.9.2 Other Hardware Considerations 

• The EZ-ICE probe adds two TTL loads to the CLKIN pin of the
ADSP-21020/21010.

• The target system design must use PMRD and DMRD to gate the
output enable of any device that can drive the ADSP-21020/21010
buses.

Figure 9.20 Target Board Connector For EZ-ICE Probe (jumpers in place)



9.10 HOST PROCESSOR INTERFACE 

In this section a bidirectional interface between an ADSP-21020 and a host 
microprocessor is described. The interface consists of three channels: the 
write channel, the read channel, and the status channel. The write channel 
transfers data from the host to the ADSP-21020. The read channel transfers 
data from the ADSP-21020 to the host. The status channel provides the 
host with information regarding the state of the read and write channels. 

The system configuration is shown in Figure 9.21. Figure 9.22 shows the 
details of the interface logic. 

When the host writes data to the port, the write channel valid flag (WCV) 
goes active. This flag informs the ADSP-21020 that valid data is present in 
the write channel. Similarly, when the ADSP-21020 writes data to the port, 
the read channel valid flag (RCV) goes active. Both flags are cleared when 
their respective channels are read. It is the channel valid flags that the host 
accesses when it reads the status channel. 

The channel valid flags are set and cleared on the rising edges of the 
strobes. This ensures that the flags reflect the true state of the channels at 
all times. For example, assume the flags are set on the falling edges of the 
strobes and that the host is writing data to the port. If the ADSP-21020 is 
much faster than the host, it may sense the flag and read the channel 
before the host has had time to put its data into the port. By changing the 
state of the flags on the rising edge of the strobes, we guarantee that the 
flags change state only after the channels do. 

9.10.1 Data Transfer Sequences 

Described below are the two basic transfer sequences. See the timing 
shown in Figures 9.23 and 9.24 for more information. In these figures, the 
falling edges of WCV and RCV indicate the presence of valid data in the 
host port latches. The rising edges indicate that the data has been read and 
the buffer is empty. The data bus shows the host port latch data. 





OCTAL POSITIVE EDGE TRIGGERED REGISTER 


Dashed lines indicate optional connections. 

Figure 9.21 Host Interface Block Diagram

[Figure 9.22 signal labels: DMOE, DMCK, HCK, HOE]

NOTES:

1. The latches shown are positive edge triggered D latches with asynchronous reset to 0.

2. Because the host acknowledge line (ACK) is shared by the memory system, HACK is
tristated except when either of the host strobes is low.

Figure 9.22 Host Interface Logic





Host Write (Host Data to ADSP-21020): 


1. The host writes data to the port. The WCV flag is set on the rising edge 
of the host write strobe. 

2. The ADSP-21020 samples the WCV flag, either by polling or interrupt. 

3. The ADSP-21020 reads the port, clearing the WCV flag. 

Because the WCV flag is cleared by the ADSP-21020 read before the host 
begins its write, the HACK line is asserted immediately after the host 
write strobe is asserted, ending the host cycle without wait states. If the 
host attempts to write the port again before the ADSP-21020 reads it, the 
HACK line is deasserted immediately after the write strobe is asserted and 
remains deasserted until the ADSP-21020 reads the data. Then the HACK 
line is asserted, ending the host cycle. The previous host data is not 
corrupted by the second host write because the data is clocked in on the 
rising edge of the write strobe. The host can avoid hanging up by reading
the status bits before each transfer.



Figure 9.23 Host Write Timing

Host Read (ADSP-21020 Data to Host):

1. The ADSP-21020 writes data to the port, setting the RCV flag on the 
rising edge of the write strobe. 

2. The host samples the RCV flag, either by being interrupted, or by
reading the status channel.

3. The host reads the port, clearing the RCV flag on the rising edge of the 
read strobe. 

Because the RCV flag is set by the ADSP-21020 write before the host 
begins its read, the HACK line is asserted immediately after the host read 
strobe is asserted, ending the host cycle without wait states. If the host 
attempts to read the port before new data has arrived, the HACK line is 
deasserted immediately after the host read strobe is asserted until the 
ADSP-21020 writes data to the port. Then the HACK line is asserted, 
ending the host cycle. 


Since this is an asynchronous interface, the status of the channels may 
change even as the host is reading the status. At worst, this could cause 
the host to wait longer than necessary before initiating the next access. 
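
The two transfer sequences can be summarized with a small behavioral model. This is a sketch only: the class and method names are hypothetical, the flags are modeled as logical "valid" booleans rather than electrical levels, and a held-off host cycle (HACK deasserted) is represented simply by a return value instead of wait states.

```python
class HostPortModel:
    """Behavioral sketch of the write, read, and status channels."""

    def __init__(self):
        self.write_latch = None  # host -> ADSP-21020 data
        self.read_latch = None   # ADSP-21020 -> host data
        self.wcv = False         # write channel valid
        self.rcv = False         # read channel valid

    def host_write(self, value) -> bool:
        """Host writes the port. Returns False if the previous word has not
        been read yet (the real host cycle would be held off instead)."""
        if self.wcv:
            return False
        self.write_latch = value
        self.wcv = True          # set at the end of the write strobe
        return True

    def dsp_read(self):
        """ADSP-21020 reads the port, clearing WCV."""
        self.wcv = False
        return self.write_latch

    def dsp_write(self, value) -> None:
        """ADSP-21020 writes the port, setting RCV."""
        self.read_latch = value
        self.rcv = True

    def host_read(self):
        """Host reads the port, clearing RCV. Returns None if no new data
        has arrived (the real host cycle would be held off instead)."""
        if not self.rcv:
            return None
        self.rcv = False
        return self.read_latch

    def status(self):
        """Status channel: the two channel valid flags."""
        return (self.wcv, self.rcv)


port = HostPortModel()
port.host_write(0x1234)      # host write sequence, step 1
print(port.status())         # prints (True, False)
print(hex(port.dsp_read()))  # prints 0x1234
print(port.status())         # prints (False, False)
```

Checking `status()` before each access corresponds to the host reading the status bits to avoid hanging up.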



Figure 9.24 Host Read Timing

9.10.2 Host Interface Code Examples

This section describes several ways to program data transfers using the 
host port. 

9.10.2.1 Buffer Transfers 

The code segment below shows buffer transfers from the host while the 
ADSP-21020 is executing a loop. It has minimal overhead for a transfer, 
and the loop length is the same whether or not there is a transfer. 

        LCNTR=length, DO END UNTIL LCE;           {Loop}
        IF NOT FLAG0 R3=DM(HOST), R1=LEFTZ R1;    {Read host if not FLAG0}
        IF SZ DM(I0,M0)=R3, R2=LEFTZ R2;          {Write buffer if SZ=1}
        instruction 1;                            {Main part of loop}
        instruction 2;

END:    instruction N;

The I0 register is the address pointer to the buffer in data memory. FLAG0
is low when data is ready to be read by the ADSP-21020. R1 is initialized
to 0xFFFFFFFF, R2 to 0. The use of the SZ flag permits the read from the
host and the write to the buffer to be indivisible. If the shifter, which tests
R1 and R2 and sets and clears the SZ flag, is used elsewhere in the loop,
the SZ flag should not be left set prior to the FLAG0 test.
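
The way the two initial register values drive the SZ flag can be illustrated with a small model. This is a sketch under stated assumptions: LEFTZ is taken to return the number of leading zero bits in a 32-bit word (32 for an all-zero word), and SZ is taken to be set when the shifter result is zero. The function names are hypothetical.

```python
def leftz(word: int) -> int:
    """Number of leading zero bits in a 32-bit word (assumed LEFTZ result)."""
    word &= 0xFFFFFFFF
    return 32 - word.bit_length()


def sz_after_leftz(word: int) -> bool:
    """Assumed SZ flag after LEFTZ: set when the shifter result is zero."""
    return leftz(word) == 0


print(sz_after_leftz(0xFFFFFFFF))  # prints True:  testing R1 sets SZ
print(sz_after_leftz(0x00000000))  # prints False: testing R2 clears SZ
```

Under these assumptions, the first conditional instruction sets SZ only when it actually reads the host, and the second instruction both performs the buffer write and clears SZ again.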

9.10.2.2 Interrupt-Driven Transfers 

Interrupt-driven transfers are useful because they allow the ADSP-21020 
to continue processing without overhead until data is ready to be 
transferred. This is desirable when communicating with a slow host. 

The code below is a simple example of an interrupt-driven I/O driver. 

transfer:   RTI (DB);               {Return (delayed)}
            R15=DM(HOST_ADDR);      {Get data from port}
            DM(I6,1)=R15;           {Transfer data to buffer, incr pointer}

The transfer interrupt request is asserted by the host interface logic each 
time the host writes to the port. An alternative to a host interrupt is a 
timer interrupt (using the ADSP-21020 timer).

9.10.2.3 High Speed Transfers

For transfer rates comparable to the ADSP-21020 cycle time, the host
interface should be connected to the ADSP-21020's program memory port.
Then data can be read from the host port and written to the data memory 
on every cycle. This may even occur in parallel with a computation. 


init:   I5=HOST_ADDR;                           {Load addr pointer to host port}
        M5=0;
        I6=DATA_BUFFER;                         {Load addr pointer to buffer}
        M6=1;
go:     R15=PM(I5,0);                           {Get first data}
        LCNTR=length, DO end UNTIL LCE;
end:    compute, DM(I6,M6)=R15, R15=PM(I5,M5);  {Write data, then get data}
        DM(I6,M6)=R15;                          {Write last data}

The dual fetch instruction in the loop transfers R15 to data memory first, 
then loads R15 from program memory (the host port). In effect, it writes 
the last value to data memory while reading the next value from the host 
port. That is why R15 is loaded from the port prior to entering the loop, 
and why R15 is written to data memory after the loop. 
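
The priming read, the overlapped loop body, and the final write described above can be traced with a short model. This is only an illustrative sketch: the function name is hypothetical, the host port is represented by an ordinary Python iterable, and timing, wait states, and the parallel compute operation are ignored.

```python
def pipelined_transfer(port_values, length):
    """Model of the loop: read once before the loop, then on each pass write
    the previous value and read the next, then write the last value."""
    source = iter(port_values)
    buffer = []
    r15 = next(source)           # go:  R15=PM(I5,0)       (get first data)
    for _ in range(length):      # LCNTR=length, DO end UNTIL LCE
        buffer.append(r15)       # DM(I6,M6)=R15           (write previous value)
        r15 = next(source)       # R15=PM(I5,M5)           (get next value)
    buffer.append(r15)           # DM(I6,M6)=R15           (write last data)
    return buffer


print(pipelined_transfer([10, 20, 30, 40], 3))  # prints [10, 20, 30, 40]
```

In this model a loop count of `length` moves `length + 1` words in total, because one word is read before the loop and one is written after it.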

If transfers at the full clock rate are desired, then the host and ADSP-21020 
should be synchronized (i.e., use the same clock). DMACK may be used to 
control throughput in this case. 

If a handshake with the host is needed, as shown in Figures 9.23 and 9.24, 
then the maximum transfer rate will be half the ADSP-21020 clock rate.

A.1 OVERVIEW

This appendix and the next one describe the ADSP-21020/21010
instruction set in detail. This appendix explains each instruction type, 
including the assembly language syntax and the opcode that the 
instruction assembles to. Many instruction types contain a field for 
specifying a compute operation (an operation that uses the ALU, 
multiplier or shifter). Because there are a large number of options 
available for this field, they are described separately in Appendix B. 

(Note that data moves between the MR registers and the register file are 
considered multiplier operations.) 

Each instruction is specified in this section. The specification shows the 
syntax of the instruction, describes its function, gives one or two 
assembly-language examples, and specifies and describes the various 
fields of its opcode. The instructions are grouped into four categories: 

I. Compute and move or modify instructions, which specify a compute 
operation in parallel with one or two data moves or an index register 
modify. 

II. Program flow control instructions, which specify various types of 
branches, calls, returns and loops. Some of these instructions may also 
specify a compute operation and/or a data move. 

III. Immediate data move instructions, which use immediate instruction 
fields as operands, or use immediate instruction fields for addressing. 

IV. Miscellaneous instructions, such as bit modify and test, no operation 
and idle. 

Many instructions can be conditional. These instructions are prefaced by 
an "IF" plus a condition mnemonic. In a conditional instruction, the 
execution of the entire instruction is based on the specified condition. 



Several sections that appear before the instruction specifications explain
the notation conventions used in this instruction set reference (for both
Appendix A and Appendix B).

• Section A.2 describes the notation and abbreviations used in the syntax 
description for each instruction. 

• Section A.3 describes the notation and abbreviations used in the 
opcode description for each instruction. 

• Section A.4 lists all condition and termination codes and their 
assembly language mnemonics. 

• Section A.5 lists the assembly language mnemonics and opcode 
addresses for all universal registers. 


A.2 INSTRUCTION SYNTAX NOTATION 

The conventions in this section are used to describe the syntax of each
instruction.


Notation                  Meaning

UPPERCASE                 explicit syntax; assembler keyword
;                         instruction terminator
,                         separates parallel operations in an instruction
italics                   optional part of instruction
| between lines |         list of options (choose one)
<datan>                   n-bit immediate data value
<addrn>                   n-bit immediate address value
<reladdrn>                n-bit immediate PC-relative address value
<bit6>:<len6>             6-bit immediate bit position and length values
                          (for shifter immediate operations)
compute                   ALU, multiplier, shifter or multifunction operation
                          (see Appendix B)
shiftimm                  shifter immediate operation (see Appendix B)
condition                 status condition (see Condition Codes)
termination               termination condition (see Condition Codes)
ureg                      universal register (see Universal Registers)
sreg                      system register (see Universal Registers)
dreg                      R15-R0, F15-F0; register file location
Rn, Rx, Ry, Ra, Rm, Rs    R15-R0; register file location, fixed-point
Fn, Fx, Fy, Fa, Fm, Fs    F15-F0; register file location, floating-point
R3-0                      R3, R2, R1, R0
R7-4                      R7, R6, R5, R4
R11-8                     R11, R10, R9, R8
R15-12                    R15, R14, R13, R12
F3-0                      F3, F2, F1, F0
F7-4                      F7, F6, F5, F4
F11-8                     F11, F10, F9, F8
F15-12                    F15, F14, F13, F12
Ia                        I7-I0; DAG1 index register
Mb                        M7-M0; DAG1 modify register
Ic                        I15-I8; DAG2 index register
Md                        M15-M8; DAG2 modify register
(DB)                      Delayed branch
(LA)                      Loop abort (pop loop, PC stacks on branch)
MR0F                      Multiplier result accumulator 0, foreground
MR1F                      Multiplier result accumulator 1, foreground
MR2F                      Multiplier result accumulator 2, foreground
MR0B                      Multiplier result accumulator 0, background
MR1B                      Multiplier result accumulator 1, background
MR2B                      Multiplier result accumulator 2, background


A.3 OPCODE NOTATION 

In ADSP-21020 opcodes, some bits are explicitly defined to be zeros or
ones. The values of other bits or fields set various parameters for the
instruction. The terms in this section define these opcode bits and fields.
Bits that are unspecified are ignored when the processor decodes the
instruction, but are reserved for future use.

A           Loop abort code
              0      Do not pop loop, PC stacks on branch
              1      Pop loop, PC stacks on branch

ADDR        Immediate address field

AI          Computation unit register
              0000   MR0F
              0001   MR1F
              0010   MR2F
              0100   MR0B
              0101   MR1B
              0110   MR2B

B           Branch type
              0      Jump
              1      Call

BOP         Bit Operation select codes
              000    Set
              001    Clear
              010    Toggle
              100    Test
              101    XOR

COMPUTE     Compute operation field (see Appendix B)

COND        Status Condition codes
              0-31 (see Condition Codes)

CU          Computation unit select codes
              00     ALU
              01     Multiplier
              10     Shifter

DATA        Immediate data field

DEC         Counter decrement code
              0      No counter decrement
              1      Counter decrement

DMD         Memory access direction
              0      Read
              1      Write

DMI         Index (I) register numbers, DAG1
              0-7

DMM         Modify (M) register numbers, DAG1
              0-7

DREG        Register file locations
              0-15

G           DAG/Memory select
              0      DAG1 or Data Memory
              1      DAG2 or Program Memory

INC         Counter increment code
              0      No counter increment
              1      Counter increment

J           Jump Type
              0      Non-delayed
              1      Delayed

LPO         Loop stack pop code
              0      No stack pop
              1      Stack pop

LPU         Loop stack push code
              0      No stack push
              1      Stack push

NUM         Interrupt vector
              0-7

OPCODE      Computation unit opcodes (see Appendix B)

PMD         Memory access direction
              0      Read
              1      Write

PMI         Index (I) register numbers, DAG2
              8-15

PMM         Modify (M) register numbers, DAG2
              8-15

RELADDR     PC-relative address field

SPO         Status stack pop code
              0      No stack pop
              1      Stack pop

SPU         Status stack push code
              0      No stack push
              1      Stack push

SREG        System Register address
              0-15 (see Universal Registers)

TERM        Termination Condition codes
              0-31 (see Condition Codes)

U           Update, index (I) register
              0      Pre-modify, no update
              1      Post-modify with update

UREG        Universal Register address
              0-256 (see Universal Registers)

RA, RM, RN, Register file locations for compute operands
RS, RX, RY  and results
              0-15

RXA         ALU x-operand register file location for multifunction
            operations
              8-11

RXM         Multiplier x-operand register file location for
            multifunction operations
              0-3

RYA         ALU y-operand register file location for multifunction
            operations
              12-15

RYM         Multiplier y-operand register file location for
            multifunction operations
              4-7


A.4 CONDITION CODES 


No.  Mnemonic        Description                      True If

0    EQ              ALU equal zero                   AZ = 1
1    LT              ALU less than zero               [AF and (AN xor (AV and ALUSAT))
                                                      or (AF and AN and AZ)] = 1
2    LE              ALU less than or equal zero      [AF and (AN xor (AV and ALUSAT))
                                                      or (AF and AN)] or AZ = 1
3    AC              ALU carry                        AC = 1
4    AV              ALU overflow                     AV = 1
5    MV              Multiplier overflow              MV = 1
6    MS              Multiplier sign                  MN = 1
7    SV              Shifter overflow                 SV = 1
8    SZ              Shifter zero                     SZ = 1
9    FLAG0_IN        Flag 0 input                     FI0 = 1
10   FLAG1_IN        Flag 1 input                     FI1 = 1
11   FLAG2_IN        Flag 2 input                     FI2 = 1
12   FLAG3_IN        Flag 3 input                     FI3 = 1
13   TF              Bit test flag                    BTF = 1
14                   Reserved
15   LCE             Loop counter expired             CURLCNTR = 1
                     (DO UNTIL term)
15   NOT LCE         Loop counter not expired         CURLCNTR ≠ 1
                     (IF cond)

Bits 16-30 are the complements of bits 0-14

16   NE              ALU not equal to zero            AZ = 0
17   GE              ALU greater than or equal zero   [AF and (AN xor (AV and ALUSAT))
                                                      or (AF and AN and AZ)] = 0
18   GT              ALU greater than zero            [AF and (AN xor (AV and ALUSAT))
                                                      or (AF and AN)] or AZ = 0
19   NOT AC          Not ALU carry                    AC = 0
20   NOT AV          Not ALU overflow                 AV = 0
21   NOT MV          Not multiplier overflow          MV = 0
22   NOT MS          Not multiplier sign              MN = 0
23   NOT SV          Not shifter overflow             SV = 0
24   NOT SZ          Not shifter zero                 SZ = 0
25   NOT FLAG0_IN    Not Flag 0 input                 FI0 = 0
26   NOT FLAG1_IN    Not Flag 1 input                 FI1 = 0
27   NOT FLAG2_IN    Not Flag 2 input                 FI2 = 0
28   NOT FLAG3_IN    Not Flag 3 input                 FI3 = 0
29   NOT TF          Not bit test flag                BTF = 0
30                   Reserved
31   FOREVER         Always False (DO UNTIL)          always
31   TRUE            Always True (IF)                 always
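
The statement that codes 16-30 are the complements of codes 0-14 can be expressed as a small lookup. This is an illustrative sketch: the dictionary covers only the non-reserved codes 0-13 from the table, and the function names are hypothetical helpers, not part of any assembler or tool.

```python
# Codes 0-13 as listed in the condition code table (code 14 is reserved).
CONDITIONS = {
    0: "EQ", 1: "LT", 2: "LE", 3: "AC", 4: "AV", 5: "MV", 6: "MS",
    7: "SV", 8: "SZ", 9: "FLAG0_IN", 10: "FLAG1_IN", 11: "FLAG2_IN",
    12: "FLAG3_IN", 13: "TF",
}

# The three ALU comparisons have their own complement mnemonics.
SPECIAL_COMPLEMENTS = {"EQ": "NE", "LT": "GE", "LE": "GT"}


def complement_code(code: int) -> int:
    """Code of the complementary condition for codes 0-14."""
    if not 0 <= code <= 14:
        raise ValueError("only codes 0-14 have a complement at code + 16")
    return code + 16


def complement_mnemonic(code: int) -> str:
    """Mnemonic of the complementary condition for codes 0-13."""
    name = CONDITIONS[code]
    return SPECIAL_COMPLEMENTS.get(name, "NOT " + name)


print(complement_code(0), complement_mnemonic(0))  # prints 16 NE
print(complement_code(8), complement_mnemonic(8))  # prints 24 NOT SZ
```

Codes 15 and 31 are not covered because their meaning depends on whether they are used as an IF condition or as a DO UNTIL termination.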

A.5 UNIVERSAL REGISTERS



Map 1 registers:

PC*         program counter
PCSTK       top of PC stack
PCSTKP      PC stack pointer
FADDR*      fetch address
DADDR*      decode address
LADDR       loop termination address
CURLCNTR    current loop counter
LCNTR       loop counter
R15 - R0    register file locations
I15 - I0    DAG1 and DAG2 index registers
M15 - M0    DAG1 and DAG2 modify registers
L15 - L0    DAG1 and DAG2 length registers
B15 - B0    DAG1 and DAG2 base registers

System Registers:

MODE1       mode control 1
MODE2       mode control 2
IRPTL       interrupt latch
IMASK       interrupt mask
IMASKP      interrupt mask pointer
ASTAT       arithmetic status
STKY        sticky status
USTAT1      user status reg 1
USTAT2      user status reg 2

* read-only





Map 1 (b7=0)

              b7 b6 b5 b4
b3b2b1b0      0000   0001   0010   0011   0100   0101   0110       0111
                                                                   (System Registers)

0000          R0     I0     M0     L0     B0            FADDR      USTAT1
0001          R1     I1     M1     L1     B1            DADDR      USTAT2
0010          R2     I2     M2     L2     B2
0011          R3     I3     M3     L3     B3            PC
0100          R4     I4     M4     L4     B4            PCSTK
0101          R5     I5     M5     L5     B5            PCSTKP
0110          R6     I6     M6     L6     B6            LADDR
0111          R7     I7     M7     L7     B7            CURLCNTR
1000          R8     I8     M8     L8     B8            LCNTR
1001          R9     I9     M9     L9     B9                       IRPTL
1010          R10    I10    M10    L10    B10                      MODE2
1011          R11    I11    M11    L11    B11                      MODE1
1100          R12    I12    M12    L12    B12                      ASTAT
1101          R13    I13    M13    L13    B13                      IMASK
1110          R14    I14    M14    L14    B14                      STKY
1111          R15    I15    M15    L15    B15                      IMASKP

Figure A.1 Map 1 Universal Register Addresses 
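
The layout of Figure A.1 suggests a simple rule for the Map 1 addresses: the upper four bits (b7-b4) select the register group and the lower four bits (b3-b0) select the register within it. The sketch below encodes that reading of the figure; the function name is hypothetical, and the table of named registers is transcribed from the figure as best it can be read.

```python
# Upper nibble (b7-b4) for the indexed register groups in Figure A.1.
GROUP_NIBBLE = {"R": 0x0, "I": 0x1, "M": 0x2, "L": 0x3, "B": 0x4}

# Individually named registers (columns 0110 and 0111 of the figure).
NAMED = {
    "FADDR": 0x60, "DADDR": 0x61, "PC": 0x63, "PCSTK": 0x64,
    "PCSTKP": 0x65, "LADDR": 0x66, "CURLCNTR": 0x67, "LCNTR": 0x68,
    "USTAT1": 0x70, "USTAT2": 0x71, "IRPTL": 0x79, "MODE2": 0x7A,
    "MODE1": 0x7B, "ASTAT": 0x7C, "IMASK": 0x7D, "STKY": 0x7E,
    "IMASKP": 0x7F,
}


def ureg_address(name: str) -> int:
    """8-bit Map 1 universal register address for a register mnemonic."""
    name = name.upper()
    if name in NAMED:
        return NAMED[name]
    prefix, index = name[0], int(name[1:])
    if prefix not in GROUP_NIBBLE or not 0 <= index <= 15:
        raise ValueError(f"not a Map 1 register: {name}")
    return (GROUP_NIBBLE[prefix] << 4) | index


print(hex(ureg_address("R5")))     # prints 0x5
print(hex(ureg_address("I12")))    # prints 0x1c
print(hex(ureg_address("ASTAT")))  # prints 0x7c
```

Map 2 registers are not covered by this sketch.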

Map 2 registers:

DMWAIT      wait state and page size control for data memory
DMBANK1     data memory bank 1 upper boundary
DMBANK2     data memory bank 2 upper boundary
DMBANK3     data memory bank 3 upper boundary
DMADR       copy of last data memory address
PMWAIT      wait state and page size control for program memory
PMBANK1     program memory bank 1 upper boundary
PMADR       copy of last program memory address
PX          48-bit PX1 and PX2 combination
PX1         bus exchange 1 (16 bits)
PX2         bus exchange 2 (32 bits)
TPERIOD     timer period
TCOUNT      timer counter



Figure A.2 Map 2 Universal Register Addresses 




Group I. Compute and Move Instructions

1. Parallel data memory and program memory transfers with register file, optional
   compute operation

2. Compute operation, optional condition

3. Transfer between data or program memory and universal register, optional
   condition, optional compute operation

4. PC-relative transfer between data or program memory and register file,
   optional condition, optional compute operation

5. Transfer between two universal registers, optional condition, optional compute
   operation

6. Immediate shift operation, optional condition, optional transfer between data or
   program memory and register file

7. Index register modify, optional condition, optional compute operation


Compute and Move

compute / dreg↔DM / dreg↔PM


Syntax:

compute ,  | DM(Ia, Mb) = dreg1 |  ,  | PM(Ic, Md) = dreg2 |  ;
           | dreg1 = DM(Ia, Mb) |     | dreg2 = PM(Ic, Md) |


Function: 

Parallel accesses to data memory and program memory from the register 
file. The specified I registers address data memory and program memory. 
The I values are post-modified and updated by the specified M registers. 
Pre-modify offset addressing is not supported. 

Examples: 

R7=BSET R6 BY R0, DM(I0,M3)=R5, PM(I11,M15)=R4;

R8=DM(I4,M1), PM(I12,M12)=R0;


Opcode: 

 47 46 45 | 44  | 43 42 41 | 40 39 38 | 37  | 36 35 34 33 | 32 31 30 | 29 28 27 | 26 25 24 23
   001    | DMD |   DMI    |   DMM    | PMD |   DMDREG    |   PMI    |   PMM    |   PMDREG

 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
 COMPUTE


DMD and PMD select the access types (read or write). DMDREG and 
PMDREG specify register file locations. DMI and PMI specify I registers 
for data and program memory. DMM and PMM specify M registers used 
to update the I registers. The COMPUTE field defines a compute 
operation to be performed in parallel with the data accesses; this is a 
NOP if no compute operation is specified in the instruction.

Compute and Move

compute 



Syntax: 

IF condition compute ; 

Function: 

Conditional compute instruction. The instruction is executed if the 
specified condition tests true. 

Examples: 

IF MS MRF = 0 ; 

F6= (F2+F3 ) / 2 ; 

Opcode: 


 47 46 45 | 44 43 42 41 40 | 39 38 | 37 36 35 34 33 | 32 31 30 29 28 27 26 25 24 23
   000    |     00001      |       |      COND      |

 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
 COMPUTE


The operation specified in the COMPUTE field is executed if the condition 
specified by COND is true. If no condition is specified in the instruction, 
COND is the TRUE condition, and the compute operation is always 
executed.

Compute and Move

compute / ureg↔DM|PM, register modify


Syntax:

a.  IF condition  compute ,  | DM(Ia, Mb) |  = ureg ;
                             | PM(Ic, Md) |

b.  IF condition  compute ,  | DM(Mb, Ia) |  = ureg ;
                             | PM(Md, Ic) |

c.  IF condition  compute ,  ureg =  | DM(Ia, Mb) |  ;
                                     | PM(Ic, Md) |

d.  IF condition  compute ,  ureg =  | DM(Mb, Ia) |  ;
                                     | PM(Md, Ic) |

Function: 




Access between data memory or program memory and a universal 
register. The specified I register addresses data memory or program 
memory. The I value is either pre-modified (M, I order) or post-modified 
(I, M order) by the specified M register. If it is post-modified, the I register 
is updated with the modified value. If a compute operation is specified, it 
is performed in parallel with the data access. If a condition is specified, it 
affects entire instruction. 

Examples: 

R6=R3-R11, DM(I0,M1)=ASTAT;

IF NOT SV F8=CLIP F2 BY F14, PX=PM(I12,M12);

Opcode:

 47 46 45 | 44 | 43 42 41 | 40 39 38 | 37 36 35 34 33 | 32 | 31 | 30 29 28 27 26 25 24 23
   010    | U  |    I     |    M     |      COND      | G  | D  |         UREG

 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
 COMPUTE


COND specifies the condition to test. If no condition is specified in the 
instruction, COND is the TRUE condition, and the instruction is always 
executed. 

D selects the access type (read or write). G selects data memory or 
program memory. UREG specifies the universal register. I specifies the 
I register, and M specifies the M register. U selects either pre-modify 
without update or post-modify with update. The COMPUTE field defines 
a compute operation to be performed in parallel with the data access; this 
is a no-operation if no compute operation is specified in the instruction. 
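
As an illustration of how the fields described above fit into the 48-bit instruction word, the sketch below packs them using the bit positions from the opcode diagram (010 in bits 47-45, U in 44, I in 43-41, M in 40-38, COND in 37-33, G in 32, D in 31, UREG in 30-23, COMPUTE in 22-0). The function names are hypothetical, and the COMPUTE value of 0 used in the example is only a placeholder, since compute encodings are defined in Appendix B.

```python
# (shift, width) of each field in the 48-bit word, from the opcode diagram.
FIELDS = {
    "OP": (45, 3), "U": (44, 1), "I": (41, 3), "M": (38, 3),
    "COND": (33, 5), "G": (32, 1), "D": (31, 1), "UREG": (23, 8),
    "COMPUTE": (0, 23),
}


def encode_ureg_transfer(u, i, m, cond, g, d, ureg, compute=0):
    """Pack the fields of a compute / ureg<->DM|PM instruction word."""
    values = {"OP": 0b010, "U": u, "I": i, "M": m, "COND": cond,
              "G": g, "D": d, "UREG": ureg, "COMPUTE": compute}
    word = 0
    for name, value in values.items():
        shift, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << shift
    return word


def extract(word, name):
    """Read one field back out of an instruction word."""
    shift, width = FIELDS[name]
    return (word >> shift) & ((1 << width) - 1)


# Data move part of "R6=R3-R11, DM(I0,M1)=ASTAT;": post-modify (U=1),
# I0, M1, TRUE condition (31), data memory (G=0), write (D=1), ASTAT (0x7C).
word = encode_ureg_transfer(u=1, i=0, m=1, cond=31, g=0, d=1, ureg=0x7C)
print(f"{word:012X}")
```

Reading fields back with `extract` is a convenient way to check a hand-assembled word against the diagram.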






Syntax:

a.  IF condition  compute ,  | DM(Ia, <data6>) |  = dreg ;
                             | PM(Ic, <data6>) |

b.  IF condition  compute ,  | DM(<data6>, Ia) |  = dreg ;
                             | PM(<data6>, Ic) |

c.  IF condition  compute ,  dreg =  | DM(Ia, <data6>) |  ;
                                     | PM(Ic, <data6>) |

d.  IF condition  compute ,  dreg =  | DM(<data6>, Ia) |  ;
                                     | PM(<data6>, Ic) |

Function: 




Access between data memory or program memory and the register file. 
The specified I register addresses data memory or program memory. The I 
value is either pre-modified (data order, I) or post-modified (I, data order) 
by the specified immediate data. If it is post-modified, the I register is 
updated with the modified value. If a compute operation is specified, it is 
performed in parallel with the data access. If a condition is specified, it 
affects entire instruction. 

Examples: 

IF FLAG0_IN F1=F5*F12 , F11=PM(I10, 40) ; 

R12=R3 AND R1, DM(6,I1)=R6;

Compute and Move

compute / dreg↔DM|PM, immediate modify


Opcode:

 47 46 45 | 44 | 43 42 41 | 40 | 39 | 38 | 37 36 35 34 33 | 32 31 30 29 28 27 | 26 25 24 23
   011    | 0  |    I     | G  | D  | U  |      COND      |       DATA        |    DREG

 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
 COMPUTE


COND specifies the condition to test. If no condition is specified in the 
instruction, COND is the TRUE condition, and the instruction is always 
executed. 

D selects the access type (read or write). G selects data memory or 
program memory. DREG specifies the register file location. I specifies the 
I register. DATA specifies a 6-bit, twos-complement modify value. U 
selects either pre-modify without update or post-modify with update. The 
COMPUTE field defines a compute operation to be performed in parallel 
with the data access; this is a no-operation if no compute operation is 
specified in the instruction. 
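
The difference between the two addressing orders can be shown with a short model. This is a sketch only: the function names are hypothetical, the I register is modeled as a plain integer, and circular buffering and address wraparound are ignored.

```python
def check_data6(value: int) -> int:
    """Validate a 6-bit, twos-complement modify value."""
    if not -32 <= value <= 31:
        raise ValueError("modify value must fit in 6 bits (-32..31)")
    return value


def post_modify(i_reg: int, data6: int):
    """(I, data) order: access at I, then update I.
    Returns (address used, new I value)."""
    check_data6(data6)
    return i_reg, i_reg + data6


def pre_modify(i_reg: int, data6: int):
    """(data, I) order: access at I + data, I is not updated.
    Returns (address used, unchanged I value)."""
    check_data6(data6)
    return i_reg + data6, i_reg


print(post_modify(0x100, 6))  # prints (256, 262)
print(pre_modify(0x100, 6))   # prints (262, 256)
```

With post-modify, repeated accesses step through memory; with pre-modify, the same base pointer can be reused with different offsets.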




Compute and Move

compute / ureg↔ureg


Syntax:

IF condition  compute ,  ureg1 = ureg2 ;


Function: 

Transfer from one universal register to another. If a compute operation is 
specified, it is performed in parallel with the data access. If a condition is 
specified, it affects entire instruction. 

Examples: 

IF TF MRF=R2*R6 (SSFR) , M4=R0; 


LCNTR=L7; 



Opcode:

 47 46 45 | 44 | 43 42 41 40 39 38 37 36 | 35 34 33 32 31 | 30 29 28 27 26 25 24 23
   011    | 1  |       Source UREG       |      COND      |        Dest UREG

 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
 COMPUTE


COND specifies the condition to test. If no condition is specified in the 
instruction, COND is the TRUE condition, and the instruction is always 
executed. 

Source UREG identifies the universal register source. Dest UREG 
identifies the universal register destination. The COMPUTE field defines a 
compute operation to be performed in parallel with the data transfer; this 
is a no-operation if no compute operation is specified in the instruction. 






Compute and Move 

immediate shift / dreg↔DM|PM


Syntax:

a.  IF condition  shiftimm ,  | DM(Ia, Mb) |  = dreg ;
                              | PM(Ic, Md) |

b.  IF condition  shiftimm ,  dreg =  | DM(Ia, Mb) |  ;
                                      | PM(Ic, Md) |

Function: 




An immediate shift operation is a shifter operation that takes immediate
data as its y-operand. The immediate data is one 8-bit value or two 6-bit
values, depending on the operation. The x-operand and the result are
register file locations.

If an access to data or program memory from the register file is specified, 
it is performed in parallel with the shifter operation. The I register 
addresses data or program memory. The I value is post-modified by the 
specified M register and updated with the modified value. If a condition is 
specified, it affects entire instruction. 

Examples: 

IF GT R2=R6 LSHIFT BY 30, DM(I4,M4)=R0;

IF NOT SZ R3=FEXT R1 BY 8:4;

Opcode: (with data access)

 47 46 45 | 44 | 43 42 41 | 40 39 38 | 37 36 35 34 33 | 32 | 31 | 30 29 28 27 | 26 25 24 23
   100    | 0  |    I     |    M     |      COND      | G  | D  |   DATAEX    |    DREG

 22 | 21 20 19 18 17 16 | 15 14 13 12 11 10 9 8 | 7 6 5 4 | 3 2 1 0
 0  |      SHIFTOP      |         DATA          |   RN    |   RX


Opcode: (without data access)

 47 46 45 | 44 43 42 41 40 | 39 38 | 37 36 35 34 33 | 32 31 | 30 29 28 27 | 26 25 24 23
   000    |     00010      |       |      COND      |       |   DATAEX    |

 22 | 21 20 19 18 17 16 | 15 14 13 12 11 10 9 8 | 7 6 5 4 | 3 2 1 0
 0  |      SHIFTOP      |         DATA          |   RN    |   RX


COND specifies the condition to test. If no condition is specified in the 
instruction, COND is the TRUE condition, and the instruction is always 
executed. 

SHIFTOP specifies the shifter operation. The DATA field specifies an 8-bit 
immediate shift value. For shifter operations requiring two 6-bit values 
(a shift value and a length value), the DATAEX field adds 4 MSBs to the 
DATA field, creating a 12-bit immediate value. The six LSBs are the shift 
value, and the six MSBs are the length value. 
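
The split of a two-value shifter immediate across the DATA and DATAEX fields can be sketched as follows. The function name is hypothetical, and both values are treated here as raw 6-bit fields; how a particular shifter operation interprets them is described in Appendix B.

```python
def pack_shift_immediate(bit6: int, len6: int):
    """Split a <bit6>:<len6> pair into the (DATAEX, DATA) opcode fields.

    The 12-bit immediate has the shift value in its six LSBs and the
    length value in its six MSBs. DATA holds the low 8 bits and DATAEX
    the upper 4 bits."""
    for value in (bit6, len6):
        if not 0 <= value <= 0x3F:
            raise ValueError("each value must fit in 6 bits")
    imm12 = (len6 << 6) | bit6
    return imm12 >> 8, imm12 & 0xFF


# "R3=FEXT R1 BY 8:4" has bit6 = 8 and len6 = 4.
dataex, data = pack_shift_immediate(8, 4)
print(dataex, data)  # prints 1 8
```

For single-value shifts, only the 8-bit DATA field is needed.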

If a memory access is specified, D selects the access type (read or write). 

G selects data memory or program memory. DREG specifies the register 
file location. I specifies the I register, which is post-modified and updated 
by the M register identified by M. 

The COMPUTE field defines a compute operation to be performed in 
parallel with the data access; this is a no-operation if no compute 
operation is specified in the instruction. 








Compute and Move 

compute / modify 


Syntax: 


IF condition  compute ,  MODIFY  | (Ia, Mb) |  ;
                                 | (Ic, Md) |


Function: 

Update of the specified I register by the specified M register. If a compute 
operation is specified, it is performed in parallel with the data access. If a 
condition is specified, it affects entire instruction. 

Examples: 

IF NOT FLAG2_IN R4=R6*R12 (SUF), MODIFY(I10,M8);

IF NOT LCE MODIFY(I3,M1);

Opcode: 



COND specifies the condition to test. If no condition is specified in the 
instruction, COND is the TRUE condition, and the instruction is always 
executed. 

G selects DAG1 or DAG2. I specifies the I register, and M specifies the
M register. The COMPUTE field defines a compute operation to be 
performed in parallel with the data access; this is a no-operation if no 
compute operation is specified in the instruction. 
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Group II. 


Program Flow Control 

8. Direct or PC-relative branch, optional condition A-24 

9. Indirect or PC-relative branch, optional condition, optional compute operation A-26 

11. Return from subroutine or interrupt, optional condition, optional compute operation A-28 

12. Load loop counter, do loop until loop counter expired A-30 

13. Do until termination A-32 



Program Flow Control 

direct jump|call 


Syntax: 


IF condition  | JUMP |  | <addr24>          |  | (DB)     | ;
              | CALL |  | (PC, <reladdr24>) |  | (LA)     |
                                               | (DB, LA) |


Function: 

A jump or call to the specified address or PC-relative address. The 
PC-relative address is a 24-bit, twos-complement value. If the delayed 
branch (DB) modifier is specified, the branch is delayed; otherwise, it is 
non-delayed. If the loop abort (LA) modifier is specified for a jump, the 
loop stacks and PC stack are popped when the jump is executed. You should 
use the (LA) modifier if the jump will transfer program execution outside 
of the loop. If there is no loop, or if the jump address is within the loop, 
you should not use the (LA) modifier. The (LA) modifier does not affect a 
call. If a condition is specified, it affects the entire instruction. 

Examples: 

IF AV JUMP (PC, 0x00A4) (LA); 

CALL init (DB) ; { init is user-defined label} 


Opcode: (with direct branch)

Bits 47-45    000
Bits 44-40    00110
Bit  39       B
Bit  38       A
Bits 37-33    COND
Bit  26       J
Bits 23-0     ADDR

Opcode: (with PC-relative branch) 

47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 



23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 



COND specifies the condition to test. If no condition is specified in the 
instruction, COND is the TRUE condition, and the instruction is always 
executed. 

B selects the branch type, jump or call. J determines whether the branch is 
delayed or non-delayed. The ADDR field specifies a 24-bit program 
memory address. RELADDR is a 24-bit, twos-complement value that is 
added to the current PC value to generate the branch address. The A bit 
activates loop abort; a jump with loop abort pops the loop and PC stacks. 
(For calls, A is ignored.) 
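As a rough illustration of how a twos-complement RELADDR yields a branch address, the sketch below sign-extends the field and adds it to the PC. The helper names are hypothetical, and wrapping the result to 24 bits is an assumption based on the 24-bit program memory address width; the same helper covers the 6-bit relative address used by the indirect branch form.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a twos-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def pc_relative_target(pc, reladdr, bits=24):
    """Add a sign-extended relative address to the PC (assumed 24-bit wrap)."""
    return (pc + sign_extend(reladdr, bits)) & 0xFFFFFF
```

With the PC at 0x000100, a RELADDR of 0x0000A4 branches forward to 0x0001A4, while a RELADDR of 0xFFFFFF (that is, -1) branches back to 0x0000FF.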
Program Flow Control 

indirect jump|call / compute 


Syntax: 


IF condition  | JUMP |  | (Md, Ic)         |  | (DB)     | , compute ;
              | CALL |  | (PC, <reladdr6>) |  | (LA)     |
                                              | (DB, LA) |


Function: 

A jump or call to the specified PC-relative address or pre-modified I 
register value. The PC-relative address is a 6-bit, twos-complement value. 
If an I register is specified, it is modified by the specified M register to 
generate the branch address. The I register is not affected by the modify 
operation. 

If the delayed branch (DB) modifier is specified, the branch is delayed; 
otherwise, it is non-delayed. If the loop abort (LA) modifier is specified for 
a jump, the loop stacks and PC stack are popped when the jump is 
executed. You should use the (LA) modifier if the jump will transfer 
program execution outside of the loop. If there is no loop, or if the jump 
address is within the loop, you should not use the (LA) modifier. The (LA) 
modifier does not affect a call. 

If a compute operation is specified, it is performed in parallel with the 
branch. If a condition is specified, it affects the entire instruction. 

Examples: 

IF EQ JUMP (M8, I12), R6=R6-1;

CALL (PC, 17) (DB), R12=MR2F;


Opcode: (with indirect branch)

Bits 47-45    000
Bits 44-40    01000
Bit  39       B
Bit  38       A
Bits 37-33    COND
Bits 32-30    PMI
Bits 29-27    PMM
Bit  26       J
Bits 22-0     COMPUTE


Opcode: (with PC-relative branch) 



COND specifies the condition to test. If no condition is specified in the 
instruction, COND is the TRUE condition, and the instruction is always 
executed. 

B selects the branch type, jump or call. J determines whether the branch is 
delayed or non-delayed. The A bit activates loop abort; a jump with loop 
abort pops the loop and PC stacks. (For calls, A is ignored.) 

RELADDR is a 6-bit, twos-complement value that is added to the current 
PC value to generate the branch address. PMI specifies the I register for 
indirect branches. The I register is pre-modified but not updated by the M 
register specified by PMM. 

The COMPUTE field defines a compute operation to be performed in 
parallel with the data access; this is a no-operation if no compute 
operation is specified in the instruction. 
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Program Flow Control 

return from subroutine|interrupt / compute 


Syntax: 


IF condition  | RTS |  (DB) , compute ;
              | RTI |


Function: 

A return from a subroutine (RTS) or from an interrupt service routine 
(RTI). If the delayed branch (DB) modifier is specified, the return is 
delayed; otherwise, it is non-delayed. 

If a compute operation is specified, it is performed in parallel with the 
branch. If a condition is specified, it affects the entire instruction. 

Examples: 

RTI, R6=R5 XOR R1; 

IF NOT GT RTS (DB) ; 

Opcode: (return from subroutine) 

Opcode: (return from interrupt) 



COND specifies the condition to test. If no condition is specified in the 
instruction, COND is the TRUE condition, and the instruction is always 
executed. 

J determines whether the return is delayed or non-delayed. The 
COMPUTE field defines a compute operation to be performed in parallel 
with the data access; this is a no-operation if no compute operation is 
specified in the instruction. 







Program Flow Control 

do until counter expired 


Syntax: 


LCNTR = | <data16> | , DO | <addr24>          | UNTIL LCE ;
        | ureg     |      | (PC, <reladdr24>) |


Function: 

Sets up a counter-based program loop. The loop counter LCNTR is loaded 
with 16-bit immediate data or from a universal register. The loop start 
address is pushed on the PC stack. The loop end address and the LCE 
termination condition are pushed on the loop address stack. The end 
address can be either a label for an absolute 24-bit program memory 
address, or a PC-relative 24-bit twos-complement address. The LCNTR is 
pushed on the loop counter stack and becomes the CURLCNTR value. 

The loop executes until the CURLCNTR reaches zero. 

Examples: 

LCNTR=100, DO fmax UNTIL LCE; {fmax is a program label}

LCNTR=R12, DO (PC, 16) UNTIL LCE;

Opcode: (with loop counter load from a universal register) 



RELADDR specifies the end-of-loop address relative to the DO LOOP 
instruction address. (The Assembler accepts an absolute address as well; 
it converts the absolute address to the equivalent relative address for 
coding.) The loop counter (LCNTR) is loaded with the 16-bit DATA value 
or with the contents of the register specified by UREG. 
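The absolute-to-relative conversion mentioned above amounts to a subtraction against the address of the DO instruction. The sketch below is a hypothetical helper, not the Assembler's actual code; it assumes the relative address is stored as a 24-bit twos-complement field.

```python
def to_reladdr(do_address, end_address, bits=24):
    """Encode an absolute end-of-loop address relative to the DO instruction."""
    return (end_address - do_address) & ((1 << bits) - 1)
```

A loop whose DO instruction sits at 0x002000 and whose last instruction is at 0x002010 would be coded with RELADDR = 0x000010.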





Program Flow Control 

do until 


Syntax: 


DO  | <addr24>          |  UNTIL termination ;
    | (PC, <reladdr24>) |


Function: 

Sets up a condition-based program loop. The loop start address is pushed 
on the PC stack. The loop end address and the termination condition are 
pushed on the loop stack. The end address can be either a label for an 
absolute 24-bit program memory address or a PC-relative, 24-bit twos- 
complement address. The loop executes until the termination condition 
tests true. 

Examples: 

DO end UNTIL FLAG1_IN; {end is a program label} 

DO (PC, 7) UNTIL AC; 


Opcode: (relative addressing)

Bits 47-45    000
Bits 44-40    01110
Bits 37-33    TERM
Bits 23-0     RELADDR


RELADDR specifies the end-of-loop address relative to the DO LOOP 
instruction address. (The Assembler accepts an absolute address as well; it 
converts the absolute address to the equivalent relative address for 
coding.) TERM specifies the termination condition. 
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Instruction Set Reference A 

Group III. 


Immediate Move 

14. Transfer between data or program memory and universal register, direct addressing, immediate address A-34 

15. Transfer between data or program memory and universal register, indirect addressing, immediate modifier A-35 

16. Immediate data write to data or program memory A-36 

17. Immediate data write to universal register A-37 
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Immediate Move 

ureg <-> DM|PM (direct addressing) 


Syntax: 


a.  | DM(<addr32>) | = ureg ;
    | PM(<addr24>) |

b.  ureg = | DM(<addr32>) | ;
           | PM(<addr24>) |


Function: 

Access between data memory or program memory and a universal 
register, with direct addressing. The entire data memory or program 
memory address is specified in the instruction. Data memory addresses 
are 32 bits wide (0 to 2^32 - 1). Program memory addresses are 24 bits wide 
(0 to 2^24 - 1). 

Examples: 

DM(temp)=MODE1; {temp is a program label}

DMWAIT=PM(0x489060);

Opcode:

Bits 47-45    000
Bits 44-42    100
Bit  41       G
Bit  40       D
Bits 39-32    UREG
Bits 31-0     ADDR


D selects the access type (read or write). G selects the memory type (data 
or program). UREG specifies the number of a universal register. ADDR 
contains the immediate address value. 
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Immediate Move 

ureg <-> DM|PM (indirect addressing) 



Syntax: 


a.  | DM(<data32>, Ia) | = ureg ;
    | PM(<data24>, Ic) |

b.  ureg = | DM(<data32>, Ia) | ;
           | PM(<data24>, Ic) |


Function: 

Access between data memory or program memory and a universal 
register, with indirect addressing using I registers. The I register is 
pre-modified with an immediate value specified in the instruction. The 
I register is not updated. Data memory address modifiers are 32 bits 
wide (0 to 2^32 - 1). Program memory address modifiers are 24 bits wide 
(0 to 2^24 - 1). 
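The difference between this pre-modify access (I register left unchanged) and the post-modify accesses described elsewhere in this appendix (I register updated) can be sketched as follows. The register-file dictionary and function names are hypothetical, and circular buffering is deliberately ignored.

```python
def pre_modify(i_regs, i, modifier, width=32):
    """Return the address I + modifier; the I register is not updated."""
    mask = (1 << width) - 1
    return (i_regs[i] + modifier) & mask


def post_modify(i_regs, i, modifier, width=32):
    """Return the address held in I, then update I by the modifier."""
    mask = (1 << width) - 1
    address = i_regs[i] & mask
    i_regs[i] = (i_regs[i] + modifier) & mask
    return address
```

With I5 holding 0x1000, a pre-modify by 24 addresses 0x1018 and leaves I5 at 0x1000; a post-modify by 24 addresses 0x1000 and leaves I5 at 0x1018.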

Examples: 

DM(24, I5)=TCOUNT;

USTAT1=PM(offs, I13); {offs is a defined constant}

Opcode:

Bits 47-45    101
Bit  44       G
Bits 43-41    I
Bit  40       D
Bits 39-32    UREG
Bits 31-0     DATA


D selects the access type (read or write). G selects the memory type (data 
or program). UREG specifies the number of a universal register. ADDR 
contains the immediate address value. The I field specifies the I register. 
The DATA field specifies the immediate modify value for the I register. 
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Immediate Move 

immediate data -> DM|PM 


Syntax: 


| DM(Ia, Mb) | = <data32> ;
| PM(Ic, Md) |


Function: 

A write of 32-bit immediate data to data or program memory, with 
indirect addressing. The data is placed in the most significant 32 bits of the 
40-bit memory word. The least significant 8 bits are loaded with 0s. The 
I register is post-modified and updated by the specified M register. 

Examples: 

DM(I4, M0)=19304;

PM(I14, M11)=count; {count is user-defined constant}


Opcode:

Bits 47-32    100, 1, I, M, G (MSB to LSB)
Bits 31-0     DATA


I selects the I register, and M selects the M register. G selects the memory 
(data or program). DATA specifies the 32-bit immediate data. 
Immediate Move 

immediate data -> ureg 



Syntax: 

ureg = <data32> ; 

Function: 

A write of 32-bit immediate data to a universal register. If the register is 40 
bits wide, the data is placed in the most significant 32 bits, and the least 
significant 8 bits are loaded with 0s. 
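The placement of 32-bit immediate data in a 40-bit register or memory word can be pictured with a one-line shift. The function name below is hypothetical; the sketch simply models the 40-bit word as a Python integer.

```python
def immediate_to_40bit(data32):
    """Place 32-bit data in the upper 32 bits of a 40-bit word; low 8 bits are 0."""
    return (data32 & 0xFFFFFFFF) << 8
```

For instance, the immediate 0xFFFC0060 becomes the 40-bit value 0xFFFC006000.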

Examples: 

IMASK=0xFFFC0060;

M15=mod1; {mod1 is user-defined constant}


Opcode: 

47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 



UREG specifies the number of a universal register. The DATA field 
specifies the immediate data value. 




Instruction Set Reference A 


Group IV. 


Miscellaneous 

18. System register bit manipulation A-40 

19. Immediate I register modify, with or without bit-reverse A-42 

20. Push or Pop of loop and/or status stacks A-44 

21. No operation (NOP) A-45 

22. Idle A-46 
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system register bit manipulation 


Syntax: 


BIT  | SET |  sreg <data32> ;
     | CLR |
     | TGL |
     | TST |
     | XOR |


Function: 

A bit manipulation operation on a system register. This instruction can set, 
clear, toggle or test specified bits, or compare (XOR) the system register 
with a specified data value. In the first four operations, the immediate 
data value is a mask. The set operation sets all the bits in the specified 
system register that are also set in the specified data value. The clear 
operation clears all the bits that are set in the data value. The toggle 
operation toggles all the bits that are set in the data value. The test 
operation sets the bit test flag (BTF in ASTAT) if all the bits that are set in 
the data value are also set in the system register. The XOR operation sets 
the bit test flag (BTF in ASTAT) if the system register value is the same as 
the data value. 

See shifter instructions for bit manipulation of data in the register file. See 
Appendix E for more information on system registers. 
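The five bit operations can be modelled on plain 32-bit integers, as in the sketch below. The function name and the (register, BTF) return convention are hypothetical. Clearing BTF when a test or compare fails is an assumption, since the text only states when the flag is set.

```python
def bit_op(op, sreg, data, btf=0):
    """Model BIT SET/CLR/TGL/TST/XOR; returns (new_sreg, new_btf)."""
    mask = 0xFFFFFFFF
    sreg &= mask
    data &= mask
    if op == "SET":
        return sreg | data, btf
    if op == "CLR":
        return sreg & ~data & mask, btf
    if op == "TGL":
        return sreg ^ data, btf
    if op == "TST":
        # BTF set only if every bit in the mask is also set in the register
        return sreg, int((sreg & data) == data)
    if op == "XOR":
        # BTF set only if the register matches the data value exactly
        return sreg, int(sreg == data)
    raise ValueError("unknown bit operation: " + op)
```

Applying "SET" with the mask 0x00000070 to a register holding 0x0000000F yields 0x0000007F and leaves BTF untouched.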

Examples: 

BIT SET MODE2 0x00000070; 


BIT TST ASTAT 0x00002000; 





Opcode:

Bits 47-45    000
Bits 44-40    10100
Bits 39-37    BOP
Bits 35-32    SREG
Bits 31-0     DATA


BOP selects one of the five bit operations. SREG specifies the system 
register. DATA specifies the data value. 
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Miscellaneous 

I register modify / bit-reverse 


Syntax: 




a.  MODIFY | (Ia, <data32>) | ;
           | (Ic, <data24>) |

b.  BITREV (Ia, <data32>) ;


Function: 

Modifies and updates the specified I register by an immediate 32-bit 
(DAG1) or 24-bit (DAG2) data value. If the address is to be bit-reversed, 
you must specify a DAG1 register (I0-I7), and the modified value is 
bit-reversed before being written back to the I register. No address is 
output in either case. 
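A minimal sketch of the BITREV update follows: the modifier is added first, and the sum is then bit-reversed before being written back. The function names are hypothetical, and reversing across the full 32-bit width of a DAG1 register is an assumption.

```python
def bit_reverse(value, width=32):
    """Reverse the order of the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bitrev_modify(i_value, data32):
    """Model BITREV (Ia, <data32>): modify, then bit-reverse the new I value."""
    return bit_reverse((i_value + data32) & 0xFFFFFFFF)
```

Starting from an I register value of 0 and a modifier of 1, the value written back would be 0x80000000.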

Examples: 

BITREV(I7, space); {space is a defined constant}

MODIFY(I4, 304);

Opcode: (without bit-reverse) 


47 46 45 

44 43 42 41 

40 39 38 

37 36 35 

34 33 32 

000 

10 110 

1 

G 


1 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 


DATA 
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I register modify / bit-reverse 



Opcode: (with bit-reverse) 


47 46 45 44 43 42 41 40 

39 

38 

37 36 35 

34 33 32 

000 

10 110 

1 

0 


1 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 1 1 10 9 8 7 6 5 4 3 2 1 0 


DATA 


G selects the address generator (DAG1 or DAG2). I selects the I register. 
DATA specifies the immediate modifier. 
Miscellaneous 

push|pop stacks 


Syntax: 


| PUSH |  LOOP ,  | PUSH |  STS ;
| POP  |          | POP  |


Function: 

Pushes or pops the loop address and loop counter stacks, and/or pushes 
or pops the status stack. 

Examples: 

PUSH LOOP, PUSH STS; 

POP STS; 


Opcode:

Bits 47-32    000, 10111, LPU, LPO, SPU, SPO (MSB to LSB)

LPU pushes the loop stacks. LPO pops the loop stacks. SPU pushes the 
status stack, and SPO pops the status stack. 
Miscellaneous 

nop 


Syntax: 

NOP; 


Function: 

A null operation; only increments the fetch address. 

Opcode:

Bits 47-45    000
Bits 44-40    00000
Bit  39       0








Miscellaneous 

idle 


Syntax: 

IDLE; 

Function: 

Executes a NOP and puts the processor in a low power state. The 
processor remains in the low power state until an external interrupt 
occurs. 

On return from the interrupt, execution continues at the instruction 
following the IDLE instruction. 

Opcode:

Bits 47-45    000
Bits 44-40    00000
Bit  39       1
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Compute Operation 

Reference 


B.1 OVERVIEW 

Compute operations execute in the multiplier, the ALU and the shifter. 

The 23-bit compute field is like a mini-instruction within the ADSP-21000 
instruction and can be specified for a variety of compute operations. This 
appendix describes each compute operation in detail, including its 
assembly language syntax and opcode field. 

A compute operation is one of the following: 

• Single-function operations involve a single computation unit. 

• Multifunction operations specify parallel operation of the multiplier and 
the ALU or two operations in the ALU. 

• The MR register transfer is a special type of compute operation 
dedicated to accessing the fixed-point accumulator in the multiplier. 

(See p. B-52). 

The operations in each category are described in the following sections. 

For each operation, the assembly language syntax, the function, and the 
opcode format and contents are specified. Refer to the beginning of 
Appendix A for an explanation of the notation and abbreviations used. 


B.2 SINGLE-FUNCTION OPERATIONS 

The compute field of a single-function operation looks like: 


Bit  22       0
Bits 21-20    CU
Bits 19-12    OPCODE
Bits 11-8     RN
Bits 7-4      RX
Bits 3-0      RY


An operation determined by OPCODE is executed in the computation unit 
specified by CU. The x- and the y-operands are received from data 
registers RX and RY. The result operand is returned to data register RN. 





The CU (computation unit) field is defined as follows: 


CU=00 ALU operations 

CU=01 Multiplier operations 

CU=10 Shifter operations 

In some shifter operations, data register RN is used both as a destination 
for a result operand and as source for a third input operand. 

The available operations and their 8-bit OPCODE values are listed in the 
following sections, organized by computation unit: ALU, multiplier and 
shifter. In each section, the syntax and opcodes for the operations are first 
summarized and then the operations are described in detail. 
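To make the single-function layout concrete, the sketch below splits a 23-bit compute field into its parts. The bit positions used (CU in bits 21-20, OPCODE in bits 19-12, then three 4-bit register fields) are an assumption inferred from the stated field order, the three CU codes and the 8-bit OPCODE width. The function name is hypothetical.

```python
CU_NAMES = {0b00: "ALU", 0b01: "Multiplier", 0b10: "Shifter"}


def decode_single_function(field):
    """Split a 23-bit single-function compute field into its subfields."""
    if field >> 23:
        raise ValueError("compute field is only 23 bits wide")
    if (field >> 22) & 1:
        raise ValueError("bit 22 set: not a single-function operation")
    cu = (field >> 20) & 0x3
    if cu not in CU_NAMES:
        raise ValueError("CU code not listed for single-function operations")
    return {
        "cu": CU_NAMES[cu],
        "opcode": (field >> 12) & 0xFF,
        "rn": (field >> 8) & 0xF,
        "rx": (field >> 4) & 0xF,
        "ry": field & 0xF,
    }
```

Under these assumptions, the field 0x00146C decodes as an ALU operation with OPCODE 0000 0001 (Rn = Rx + Ry), RN = 4, RX = 6 and RY = 12, that is, R4 = R6 + R12.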


B.2.1 ALU Operations 

The ALU operations are described in this section. Tables B.1 and B.2 
summarize the syntax and opcodes for the fixed-point and floating-point 
ALU operations, respectively. The rest of this section contains detailed 
descriptions of each operation. 


Syntax                    Opcode

Rn = Rx + Ry              0000 0001
Rn = Rx - Ry              0000 0010
Rn = Rx + Ry + CI         0000 0101
Rn = Rx - Ry + CI - 1     0000 0110
Rn = (Rx + Ry)/2          0000 1001
COMP(Rx, Ry)              0000 1010
Rn = Rx + CI              0010 0101
Rn = Rx + CI - 1          0010 0110
Rn = Rx + 1               0010 1001
Rn = Rx - 1               0010 1010
Rn = -Rx                  0010 0010
Rn = ABS Rx               0011 0000
Rn = PASS Rx              0010 0001
Rn = Rx AND Ry            0100 0000
Rn = Rx OR Ry             0100 0001
Rn = Rx XOR Ry            0100 0010
Rn = NOT Rx               0100 0011
Rn = MIN(Rx, Ry)          0110 0001
Rn = MAX(Rx, Ry)          0110 0010
Rn = CLIP Rx BY Ry        0110 0011

Table B.1 Fixed-Point ALU Operations 




Compute Operation 


Syntax                    Opcode

Fn = Fx + Fy              1000 0001
Fn = Fx - Fy              1000 0010
Fn = ABS (Fx + Fy)        1001 0001
Fn = ABS (Fx - Fy)        1001 0010
Fn = (Fx + Fy)/2          1000 1001
COMP(Fx, Fy)              1000 1010
Fn = -Fx                  1010 0010
Fn = ABS Fx               1011 0000
Fn = PASS Fx              1010 0001
Fn = RND Fx               1010 0101
Fn = SCALB Fx BY Ry       1011 1101
Rn = MANT Fx              1010 1101
Rn = LOGB Fx              1100 0001
Rn = FIX Fx BY Ry         1101 1001
Rn = FIX Fx               1100 1001
Fn = FLOAT Rx BY Ry       1101 1010
Fn = FLOAT Rx             1100 1010
Fn = RECIPS Fx            1100 0100
Fn = RSQRTS Fx            1100 0101
Fn = Fx COPYSIGN Fy       1110 0000
Fn = MIN(Fx, Fy)          1110 0001
Fn = MAX(Fx, Fy)          1110 0010
Fn = CLIP Fx BY Fy        1110 0011

Table B.2 Floating-Point ALU Operations 
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ALU Fixed-Point 

Rn = Rx + Ry 

Syntax: 

Rn = Rx + Ry 


Function: 

Adds the fixed-point fields in registers Rx and Ry. The result is placed in 
the fixed-point field in register Rn. The floating-point extension field in Rn 
is set to all 0s. In saturation mode (the ALU saturation mode bit in 
MODE1 set) positive overflows return the maximum positive number 
(0x7FFF FFFF), and negative overflows return the minimum negative 
number (0x8000 0000). 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1, otherwise cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 
AI Is cleared 



ALU Fixed-Point 

Rn = Rx - Ry 


Syntax: 

Rn = Rx - Ry 


Function: 

Subtracts the fixed-point field in register Ry from the fixed-point field in 
register Rx. The result is placed in the fixed-point field in register Rn. The 
floating-point extension field in Rn is set to all 0s. In saturation mode (the 
ALU saturation mode bit in MODE1 set) positive overflows return the 
maximum positive number (0x7FFF FFFF), and negative overflows return 
the minimum negative number (0x8000 0000). 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1, otherwise cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 
AI Is cleared 
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ALU Fixed-Point 

Rn = Rx + Ry + CI 

Syntax: 

Rn = Rx + Ry + CI 


Function: 

Adds with carry (AC from ASTAT) the fixed-point fields in registers Rx 
and Ry. The result is placed in the fixed-point field in register Rn. The 
floating-point extension field in Rn is set to all 0s. In saturation mode (the 
ALU saturation mode bit in MODE1 set) positive overflows return the 
maximum positive number (0x7FFF FFFF), and negative overflows return 
the minimum negative number (0x8000 0000). 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1, otherwise cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 
AI Is cleared 



ALU Fixed-Point 

Rn = Rx - Ry + CI - 1 


Syntax: 

Rn = Rx - Ry + CI - 1 

Function: 

Subtracts with borrow (AC - 1 from ASTAT) the fixed-point field in 
register Ry from the fixed-point field in register Rx. The result is placed in 
the fixed-point field in register Rn. The floating-point extension field in Rn 
is set to all 0s. In saturation mode (the ALU saturation mode bit in 
MODE1 set) positive overflows return the maximum positive number 
(0x7FFF FFFF), and negative overflows return the minimum negative 
number (0x8000 0000). 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1, otherwise cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 
AI Is cleared 


Rn = (Rx + Ry)/2 


Syntax: 

Rn = (Rx + Ry)/2 


Function: 

Adds the fixed-point fields in registers Rx and Ry and divides the result 
by 2. The result is placed in the fixed-point field in register Rn. The 
floating-point extension field in Rn is set to all 0s. Rounding is to nearest 
(IEEE) or by truncation, as defined by the rounding mode bit in the 
MODE1 register. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 

AI Is cleared 



COMP(Rx, Ry) 


Syntax: 

COMP(Rx, Ry) 

Function: 

Compares the fixed-point field in register Rx with the fixed-point field in 
register Ry. Sets the AZ flag if the two operands are equal, and the AN 
flag if the operand in register Rx is smaller than the operand in register 
Ry. 

The ASTAT register stores the results of the previous eight ALU compare 
operations in bits 24-31. These bits are shifted right (bit 24 is overwritten) 
whenever a fixed-point or floating-point compare instruction is executed. 
The MSB of ASTAT is set if the X operand is greater than the Y operand 
(its value is the AND of NOT AZ and NOT AN); it is otherwise cleared. 

Status flags: 

AZ Is set if the operands in registers Rx and Ry are equal, otherwise 
cleared 

AU Is cleared 

AN Is set if the operand in the Rx register is smaller than the operand in 
the Ry register, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 
AI Is cleared 
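The compare history kept in ASTAT bits 24-31 behaves like an 8-bit shift register, which the sketch below models. The function name is hypothetical, treating the operands as signed twos-complement values is an assumption, and AZ and AN are returned separately rather than written into ASTAT because their bit positions are not given here.

```python
def comp_update_astat(astat, rx, ry):
    """Model COMP(Rx, Ry): returns (new_astat, AZ, AN)."""
    def signed(value):
        value &= 0xFFFFFFFF
        return value - (1 << 32) if value & 0x80000000 else value

    x, y = signed(rx), signed(ry)
    az = int(x == y)
    an = int(x < y)
    greater = int(not az and not an)          # AND of NOT AZ and NOT AN
    history = (astat >> 24) & 0xFF
    history = (history >> 1) | (greater << 7)  # shift right, new result in the MSB
    return (astat & 0x00FFFFFF) | (history << 24), az, an
```

After comparing 5 with 3 on a cleared ASTAT, bit 31 is set; a following compare of equal operands shifts that result down to bit 30 and sets AZ.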


ALU Fixed-Point 

Rn = Rx + CI 

Syntax: 

Rn = Rx + CI 


Function: 

Adds the fixed-point field in register Rx with the carry flag from the 
ASTAT register (AC). The result is placed in the fixed-point field in 
register Rn. The floating-point extension field in Rn is set to all 0s. In 
saturation mode (the ALU saturation mode bit in MODE1 set) positive 
overflows return the maximum positive number (0x7FFF FFFF). 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1, otherwise cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 
AI Is cleared 




Rn = Rx + CI - 1 



Syntax: 

Rn = Rx + CI - 1 

Function: 

Adds the fixed-point field in register Rx with the borrow from the ASTAT 
register (AC - 1). The result is placed in the fixed-point field in register Rn. 
The floating-point extension field in Rn is set to all 0s. In saturation mode 
(the ALU saturation mode bit in MODE1 set) positive overflows return the 
maximum positive number (0x7FFF FFFF). 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1, otherwise cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 
AI Is cleared 






ALU Fixed-Point 

Rn = Rx + 1 


Syntax: 

Rn = Rx + 1 

Function: 

Increments the fixed-point operand in register Rx. The result is placed in 
the fixed-point field in register Rn. The floating-point extension field in Rn 
is set to all 0s. In saturation mode (the ALU saturation mode bit in 
MODE1 set), overflow causes the maximum positive number 
(0x7FFF FFFF) to be returned. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1, otherwise cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 
AI Is cleared 
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ALU Fixed-Point 

Rn = Rx - 1 



Syntax: 

Rn = Rx - 1 


Function: 

Decrements the fixed-point operand in register Rx. The result is placed in 
the fixed-point field in register Rn. The floating-point extension field in Rn 
is set to all 0s. In saturation mode (the ALU saturation mode bit in 
MODE1 set), underflow causes the minimum negative number 
(0x8000 0000) to be returned. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1, otherwise cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 
AI Is cleared 
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ALU Fixed-Point 

Rn = -Rx 


Syntax: 

Rn = -Rx 

Function: 

Negates the fixed-point operand in Rx by twos complement. The result is 
placed in the fixed-point field in register Rn. The floating-point extension 
field in Rn is set to all 0s. Negation of the minimum negative number 
(0x8000 0000) causes an overflow. In saturation mode (the ALU saturation 
mode bit in MODE1 set), overflow causes the maximum positive number 
(0x7FFF FFFF) to be returned. 

Status flags: 

AZ Is set if the fixed-point output is all 0s 
AU Is cleared 

AN Is set if the most significant output bit is 1 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 
AS Is cleared 
AI Is cleared 



Rn = ABS Rx 


Syntax: 

Rn = ABS Rx 

Function: 

Determines the absolute value of the fixed-point operand in Rx. The result 
is placed in the fixed-point field in register Rn. The floating-point 
extension field in Rn is set to all 0s. ABS of the minimum negative number 
(0x8000 0000) causes an overflow. In saturation mode (the ALU saturation 
mode bit in MODE1 set), overflow causes the maximum positive number 
(0x7FFF FFFF) to be returned. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is set if the XOR of the carries of the two most significant adder 
stages is 1, otherwise cleared 

AC Is set if the carry from the most significant adder stage is 1, 
otherwise cleared 

AS Is set if the fixed-point operand in Rx is negative, otherwise cleared 
AI Is cleared 



ALU Fixed-Point 

Rn = PASS Rx 


Syntax: 

Rn = PASS Rx 

Function: 

Passes the fixed-point operand in Rx through the ALU to the fixed-point 
field in register Rn. The floating-point extension field in Rn is set to all 0s. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 
AI Is cleared 
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ALU Fixed-Point 

Rn = Rx AND Ry 



Syntax: 

Rn = Rx AND Ry 

Function: 

Logically ANDs the fixed-point operands in Rx and Ry. The result is 
placed in the fixed-point field in Rn. The floating-point extension field in 
Rn is set to all 0s. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 
AI Is cleared 
ALU Fixed-Point 

Rn = Rx OR Ry 


Syntax: 

Rn = Rx OR Ry 

Function: 

Logically ORs the fixed-point operands in Rx and Ry. The result is placed 
in the fixed-point field in Rn. The floating-point extension field in Rn is set 
to all 0s. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 
AI Is cleared 
ALU Fixed-Point 

Rn = Rx XOR Ry 



Syntax: 

Rn = Rx XOR Ry 

Function: 

Logically XORs the fixed-point operands in Rx and Ry. The result is 
placed in the fixed-point field in Rn. The floating-point extension field in 
Rn is set to all 0s. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 
AI Is cleared 
ALU Fixed-Point 

Rn = NOT Rx 


Syntax: 

Rn = NOT Rx 

Function: 

Logically complements the fixed-point operand in Rx. The result is placed 
in the fixed-point field in Rn. The floating-point extension field in Rn is set 
to all 0s. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 
AI Is cleared 



ALU Fixed-Point 

Rn = MIN(Rx, Ry) 


Syntax: 

Rn = MIN(Rx, Ry) 

Function: 

Returns the smaller of the two fixed-point operands in Rx and Ry. The 
result is placed in the fixed-point field in register Rn. The floating-point 
extension field in Rn is set to all 0s. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 
AI Is cleared 
ALU Fixed-Point 

Rn = MAX(Rx, Ry) 


Syntax: 

Rn = MAX(Rx, Ry) 

Function: 

Returns the larger of the two fixed-point operands in Rx and Ry. The 
result is placed in the fixed-point field in register Rn. The floating-point 
extension field in Rn is set to all 0s. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 
AI Is cleared 
ALU Fixed-Point 

Rn = CLIP Rx BY Ry 


Syntax: 

Rn = CLIP Rx BY Ry 

Function: 

Returns the fixed-point operand in Rx if the absolute value of the operand 
in Rx is less than the absolute value of the fixed-point operand in Ry. 
Otherwise, returns |Ry| if Rx is positive, and -|Ry| if Rx is negative. The 
result is placed in the fixed-point field in register Rn. The floating-point 
extension field in Rn is set to all 0s. 

Status flags: 

AZ Is set if the fixed-point output is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant output bit is 1, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 
AI Is cleared 
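
A minimal Python sketch of the clipping rule described above, assuming 32-bit twos-complement words. The names `alu_clip` and `to_signed32` are hypothetical, and status flags are not modeled.

```python
def to_signed32(u):
    """Interpret a 32-bit word as a twos-complement integer."""
    u &= 0xFFFFFFFF
    return u - (1 << 32) if u & 0x80000000 else u

def alu_clip(rx, ry):
    """Model of Rn = CLIP Rx BY Ry on 32-bit words."""
    x, y = to_signed32(rx), to_signed32(ry)
    limit = abs(y)
    if abs(x) < limit:
        result = x                            # Rx is inside the window: pass it through
    else:
        result = limit if x >= 0 else -limit  # otherwise clamp to +|Ry| or -|Ry|
    return result & 0xFFFFFFFF
```

Note that only the magnitude of Ry matters: `alu_clip(10, to_clip)` behaves the same for `to_clip = 50` and for the word encoding -50.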
ALU Floating-Point 

Fn = Fx + Fy 


Syntax: 

Fn = Fx + Fy 

Function: 

Adds the floating-point operands in registers Fx and Fy. The normalized 
result is placed in register Fn. Rounding is to nearest (IEEE) or by 
truncation, to a 32-bit or to a 40-bit boundary, as defined by the rounding 
mode and rounding boundary bits in MODE1. Post-rounded overflow 
returns ±Infinity (round-to-nearest) or ±NORM.MAX (round-to-zero). 
Post-rounded denormal returns ±Zero. Denormal inputs are flushed to 
±Zero. A NAN input returns an all 1s result. 

Status flags: 

AZ Is set if the post-rounded result is a denormal 
(unbiased exponent < -126) or zero, otherwise cleared 
AU Is set if the post-rounded result is a denormal, otherwise cleared 
AN Is set if the floating-point result is negative, otherwise cleared 
AV Is set if the post-rounded result overflows 
(unbiased exponent > +127), otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, or if they are 
opposite-signed Infinities, otherwise cleared 
ALU Floating-Point 

Fn = Fx - Fy 


Syntax: 

Fn = Fx - Fy 

Function: 

Subtracts the floating-point operand in register Fy from the floating-point 
operand in register Fx. The normalized result is placed in register Fn. 
Rounding is to nearest (IEEE) or by truncation, to a 32-bit or to a 40-bit 
boundary, as defined by the rounding mode and rounding boundary bits 
in MODE1. Post-rounded overflow returns ±Infinity (round-to-nearest) or 
±NORM.MAX (round-to-zero). Post-rounded denormal returns ±Zero. 
Denormal inputs are flushed to ±Zero. A NAN input returns an all 1s 
result. 

Status flags: 

AZ Is set if the post-rounded result is a denormal 
(unbiased exponent < -126) or zero, otherwise cleared 
AU Is set if the post-rounded result is a denormal, otherwise cleared 
AN Is set if the floating-point result is negative, otherwise cleared 
AV Is set if the post-rounded result overflows 
(unbiased exponent > +127), otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, or if they are like- 
signed Infinities, otherwise cleared 
ALU Floating-Point 

Fn = ABS (Fx + Fy) 


Syntax: 

Fn = ABS (Fx + Fy) 

Function: 

Adds the floating-point operands in registers Fx and Fy, and places the 
absolute value of the normalized result in register Fn. Rounding is to 
nearest (IEEE) or by truncation, to a 32-bit or to a 40-bit boundary, as 
defined by the rounding mode and rounding boundary bits in MODE1. 
Post-rounded overflow returns +Infinity (round-to-nearest) or 
+NORM.MAX (round-to-zero). Post-rounded denormal returns +Zero. 
Denormal inputs are flushed to ±Zero. A NAN input returns an all 1s 
result. 

Status flags: 

AZ Is set if the post-rounded result is a denormal 
(unbiased exponent < -126) or zero, otherwise cleared 
AU Is set if the post-rounded result is a denormal, otherwise cleared 
AN Is cleared 
AV Is set if the post-rounded result overflows 
(unbiased exponent > +127), otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, or if they are 
opposite-signed Infinities, otherwise cleared 
ALU Floating-Point 

Fn = ABS (Fx - Fy) 


Syntax: 

Fn = ABS (Fx - Fy) 

Function: 

Subtracts the floating-point operand in Fy from the floating-point operand 
in Fx and places the absolute value of the normalized result in register Fn. 
Rounding is to nearest (IEEE) or by truncation, to a 32-bit or to a 40-bit 
boundary, as defined by the rounding mode and rounding boundary bits 
in MODE1. Post-rounded overflow returns +Infinity (round-to-nearest) or 
+NORM.MAX (round-to-zero). Post-rounded denormal returns +Zero. 
Denormal inputs are flushed to ±Zero. A NAN input returns an all 1s 
result. 

Status flags: 

AZ Is set if the post-rounded result is a denormal 
(unbiased exponent < -126) or zero, otherwise cleared 
AU Is set if the post-rounded result is a denormal, otherwise cleared 
AN Is cleared 
AV Is set if the post-rounded result overflows 
(unbiased exponent > +127), otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, or if they are like- 
signed Infinities, otherwise cleared 



ALU Floating-Point 

Fn = (Fx + Fy)/2 


Syntax: 

Fn = (Fx + Fy)/2 

Function: 

Adds the floating-point operands in registers Fx and Fy and divides the 
result by 2, by decrementing the exponent of the sum before rounding. 

The normalized result is placed in register Fn. Rounding is to nearest 
(IEEE) or by truncation, to a 32-bit or to a 40-bit boundary, as defined by 
the rounding mode and rounding boundary bits in MODE1. Post-rounded 
overflow returns ±Infinity (round-to-nearest) or ±NORM.MAX 
(round-to-zero). Post-rounded denormal results return ±Zero. A denormal 
input is flushed to ±Zero. A NAN input returns an all 1s result. 

Status flags: 

AZ Is set if the post-rounded result is a denormal 
(unbiased exponent < -126) or zero, otherwise cleared 
AU Is set if the post-rounded result is a denormal, otherwise cleared 
AN Is set if the floating-point result is negative, otherwise cleared 
AV Is set if the post-rounded result overflows 
(unbiased exponent > +127), otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, or if they are 
opposite-signed Infinities, otherwise cleared 
ALU Floating-Point 

COMP(Fx, Fy) 


Syntax: 

COMP(Fx, Fy) 

Function: 

Compares the floating-point operand in register Fx with the floating-point 
operand in register Fy. Sets the AZ flag if the two operands are equal, and 
the AN flag if the operand in register Fx is smaller than the operand in 
register Fy. 

The ASTAT register stores the results of the previous eight ALU compare 
operations in bits 24-31. These bits are shifted right (bit 24 is overwritten) 
whenever a fixed-point or floating-point compare instruction is executed. 
The MSB of ASTAT is set if the X operand is greater than the Y operand 
(its value is the AND of ~AZ and ~AN); it is otherwise cleared. 

Status flags: 

AZ Is set if the operands in registers Fx and Fy are equal, otherwise 
cleared 

AU Is cleared 

AN Is set if the operand in the Fx register is smaller than the operand in 
the Fy register, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, otherwise cleared 
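
The compare-history behavior of ASTAT bits 24-31 can be sketched as a small shift register. The following Python model is an illustration only: the function name `comp` and the dictionary of flags are hypothetical, and NAN operands (which set AI on the processor) are simply rejected here.

```python
import math

def comp(astat, fx, fy):
    """Model of COMP(Fx, Fy): returns (new_astat, flags)."""
    if math.isnan(fx) or math.isnan(fy):
        raise ValueError("NAN operands are not modeled in this sketch")
    az = fx == fy                          # AZ: operands are equal
    an = fx < fy                           # AN: Fx is smaller than Fy
    history = (astat >> 24) & 0xFF         # previous eight compare results
    history >>= 1                          # shift right; the old bit 24 is lost
    if not az and not an:                  # Fx > Fy, i.e. ~AZ AND ~AN
        history |= 0x80                    # newest result enters at bit 31
    new_astat = (astat & 0x00FFFFFF) | (history << 24)
    return new_astat, {"AZ": az, "AN": an}
```

After three successive compares in which only the first found Fx greater than Fy, the history field reads 0x20: the set bit has moved from bit 31 down to bit 29.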


ALU Floating-Point 

Fn = -Fx 


Syntax: 

Fn = -Fx 


Function: 

Complements the sign bit of the floating-point operand in Fx. The 
complemented result is placed in register Fn. A denormal input is flushed 
to ±Zero. A NAN input returns an all 1s result. 

Status flags: 

AZ Is set if the result operand is a ±Zero, otherwise cleared 
AU Is cleared 

AN Is set if the floating-point result is negative, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 

AI Is set if the input operand is a NAN, otherwise cleared 
ALU Floating-Point 

Fn = ABS Fx 



Syntax: 

Fn = ABS Fx 

Function: 

Returns the absolute value of the floating-point operand in register Fx by 
setting the sign bit of the operand to 0. Denormal inputs are flushed to 
+Zero. A NAN input returns an all 1s result. 

Status flags: 

AZ Is set if the result operand is +Zero, otherwise cleared. 

AU Is cleared 
AN Is cleared 
AV Is cleared 
AC Is cleared 

AS Is set if the input operand is negative, otherwise cleared 
AI Is set if the input operand is a NAN, otherwise cleared 
ALU Floating-Point 

Fn = PASS Fx 


Syntax: 

Fn = PASS Fx 

Function: 

Passes the floating-point operand in Fx through the ALU to the 
floating-point field in register Fn. Denormal inputs are flushed to ±Zero. 
A NAN input returns an all 1s result. 

Status flags: 

AZ Is set if the result operand is a ±Zero, otherwise cleared 
AU Is cleared 

AN Is set if the floating-point result is negative, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 

AI Is set if the input operand is a NAN, otherwise cleared 



ALU Floating-Point 

Fn = RND Fx 


Syntax: 

Fn = RND Fx 

Function: 

Rounds the floating-point operand in register Fx to a 32-bit boundary. 
Rounding is to nearest (IEEE) or by truncation, as defined by the rounding 
mode bit in MODE1. Post-rounded overflow returns ±Infinity 
(round-to-nearest) or ±NORM.MAX (round-to-zero). A denormal input is 
flushed to ±Zero. A NAN input returns an all 1s result. 

Status flags: 

AZ Is set if the result operand is a ±Zero, otherwise cleared 
AU Is cleared 

AN Is set if the floating-point result is negative, otherwise cleared 
AV Is set if the post-rounded result overflows 
(unbiased exponent > +127), otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if the input operand is a NAN, otherwise cleared 



ALU Floating-Point 

Fn = SCALB Fx BY Ry 


Syntax: 

Fn = SCALB Fx BY Ry 

Function: 

Scales the exponent of the floating-point operand in Fx by adding to it the 
fixed-point twos-complement integer in Ry. The scaled floating-point 
result is placed in register Fn. Overflow returns ±Infinity (round-to- 
nearest) or ±NORM.MAX (round-to-zero). Denormal returns ±Zero. 
Denormal inputs are flushed to ±Zero. A NAN input returns an all 1s 
result. 

Status flags: 

AZ Is set if the result is a denormal (unbiased exponent < -126) or zero, 
otherwise cleared 

AU Is set if the post-rounded result is a denormal, otherwise cleared 
AN Is set if the floating-point result is negative, otherwise cleared 
AV Is set if the result overflows (unbiased exponent > +127), otherwise 
cleared 

AC Is cleared 
AS Is cleared 

AI Is set if the input is a NAN, otherwise cleared 
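
The exponent scaling can be approximated on a host with `math.ldexp`. The sketch below is an assumption-laden model, not the hardware behavior: it works in Python doubles, applies the 32-bit single-precision limits by hand, assumes round-to-nearest mode for overflow, and uses the hypothetical name `scalb`.

```python
import math

FLT_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127   # largest normalized 32-bit value
FLT_MIN_NORMAL = 2.0 ** -126                # smallest normalized 32-bit value

def scalb(fx, ry):
    """Model of Fn = SCALB Fx BY Ry for 32-bit floats, round-to-nearest mode."""
    try:
        y = math.ldexp(fx, ry)              # fx * 2**ry
    except OverflowError:
        y = math.copysign(math.inf, fx)
    if abs(y) > FLT_MAX:
        return math.copysign(math.inf, y)   # overflow returns ±Infinity
    if y != 0.0 and abs(y) < FLT_MIN_NORMAL:
        return math.copysign(0.0, y)        # denormal result returns ±Zero
    return y
```

For instance, `scalb(1.5, 4)` gives 24.0, while `scalb(1.0, 200)` overflows the single-precision range and returns +Infinity.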



ALU Floating-Point 

Rn = MANT Fx 


Syntax: 

Rn = MANT Fx 

Function: 

Extracts the mantissa (fraction bits with explicit hidden bit, excluding the 
sign bit) from the floating-point operand in Fx. The unsigned-magnitude 
result is left-justified (1.31 format) in the fixed-point field in Rn. Rounding 
modes are ignored and no rounding is performed because all results are 
inherently exact. Denormal inputs are flushed to ±Zero. A NAN or an 
Infinity input returns an all 1s result (-1 in signed fixed-point format). 

Status flags: 

AZ Is set if the result is zero, otherwise cleared 
AU Is cleared 
AN Is cleared 
AV Is cleared 
AC Is cleared 
AS Is set if the input is negative, otherwise cleared 
AI Is set if the input operand is a NAN or an Infinity, otherwise 
cleared 
ALU Floating-Point 

Rn = LOGB Fx 


Syntax: 

Rn = LOGB Fx 

Function: 

Converts the exponent of the floating-point operand in register Fx to an 
unbiased twos-complement fixed-point integer. The result is placed in the 
fixed-point field in register Rn. Unbiasing is done by subtracting 127 from 
the floating-point exponent in Fx. If saturation mode is not set, a ±Infinity 
input returns a floating-point +Infinity and a ±Zero input returns a 
floating-point -Infinity. If saturation mode is set, a ±Infinity input returns 
the maximum positive value (0x7FFF FFFF) and a ±Zero input returns the 
maximum negative value (0x8000 0000). Denormal inputs are flushed to 
±Zero. A NAN input returns an all 1s result. 

Status flags: 

AZ Is set if the fixed-point result is zero, otherwise cleared 
AU Is cleared 

AN Is set if the result is negative, otherwise cleared 
AV Is set if the input operand is an Infinity or a Zero, otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if the input is a NAN, otherwise cleared 
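
As an illustration of the MANT and LOGB field extraction for 32-bit operands, here is a small Python sketch. It is a model under assumptions: `float_bits`, `mant` and `logb` are hypothetical names, only the 32-bit IEEE single format is handled, and the special cases (NAN, Infinity, and zero for LOGB) are rejected rather than reproduced.

```python
import struct

def float_bits(x):
    """Return the 32-bit IEEE single-precision bit pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def mant(x):
    """Model of Rn = MANT Fx: hidden bit plus fraction, left-justified (1.31)."""
    bits = float_bits(x)
    exp = (bits >> 23) & 0xFF
    if exp == 255:
        raise ValueError("NAN and Infinity are not modeled in this sketch")
    if exp == 0:
        return 0                          # zero, or a denormal flushed to zero
    frac = bits & 0x7FFFFF
    return ((1 << 23) | frac) << 8        # 24 significant bits moved to bits 31-8

def logb(x):
    """Model of Rn = LOGB Fx: biased exponent minus 127."""
    exp = (float_bits(x) >> 23) & 0xFF
    if exp in (0, 255):
        raise ValueError("zero, denormal, Infinity and NAN are not modeled")
    return exp - 127
```

With these definitions, 1.5 has mantissa word `0xC0000000` (1.5 in unsigned 1.31 format) and unbiased exponent 0, and 8.0 has unbiased exponent 3.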





Rn = FIX Fx BY Ry / Rn = FIX Fx 


Syntax: 

Rn = FIX Fx BY Ry 
Rn = FIX Fx 

Function: 

Converts the floating-point operand in Fx to a twos-complement 32-bit 
fixed-point integer result. If a scaling factor (Ry) is specified, the fixed- 
point twos-complement integer in Ry is added to the exponent of the 
floating-point operand in Fx before the conversion. The result of the 
conversion is right-justified (32.0 format) in the fixed-point field in register 
Rn. The floating-point extension field in Rn is set to all 0s. In saturation 
mode (the ALU saturation mode bit in MODE1 set) positive overflows 
and +Infinity return the maximum positive number (0x7FFF FFFF), and 
negative overflows and -Infinity return the minimum negative number 
(0x8000 0000). 

Rounding is to nearest (IEEE) or by truncation, as defined by the rounding 
mode bit in MODE1. A NAN input returns a floating-point all 1s result. If 
saturation mode is not set, an Infinity input or a result that overflows 
returns a floating-point all 1s result. All positive underflows return zero. 
Negative underflows that are rounded-to-nearest return zero, and 
negative underflows that are rounded by truncation return 
-1 (0xFFFF FFFF 00). 

Status flags: 

AZ Is set if the fixed-point result is Zero, otherwise cleared 
AU Is set if the pre-rounded result is a denormal, otherwise cleared 
AN Is set if the fixed-point result is negative, otherwise cleared 
AV Is set if the conversion causes the floating-point mantissa to be 
shifted left, i.e., if the floating-point exponent + scale bias is 
> 157 (127 + 31 - 1) or if the input is ±Infinity, otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if the input operand is a NAN or, when saturation mode is not 
set, if the input is an Infinity or the result overflows, otherwise 
cleared 


ALU Floating-Point 

Fn = FLOAT Rx BY Ry / Fn = FLOAT Rx 


Syntax: 

Fn = FLOAT Rx BY Ry 
Fn = FLOAT Rx 


Function: 

Converts the fixed-point operand in Rx to a floating-point result. If a 
scaling factor (Ry) is specified, the fixed-point twos-complement integer in 
Ry is added to the exponent of the floating-point result. The final result is 
placed in register Fn. 


Rounding is to nearest (IEEE) or by truncation, as defined by the rounding 
mode, to a 40-bit boundary, regardless of the values of the rounding 
boundary bits in MODE1. The exponent scale bias may cause a 
floating-point overflow or a floating-point underflow. Overflow causes 
±Infinity (round-to-nearest) or ±NORM.MAX (round-to-zero) to be returned; 
underflow causes a ±Zero to be returned. 


Status flags: 

AZ Is set if the result is a denormal (unbiased exponent < -126) or zero, 
otherwise cleared 

AU Is set if the post-rounded result is a denormal, otherwise cleared 
AN Is set if the floating-point result is negative, otherwise cleared 
AV Is set if the result overflows (unbiased exponent > +127), otherwise 
cleared 

AC Is cleared 
AS Is cleared 
AI Is cleared 
ALU Floating-Point 

Fn = RECIPS Fx 


Syntax: 

Fn = RECIPS Fx 


Function: 

Creates an 8-bit accurate seed for 1/Fx, the reciprocal of Fx. The mantissa 
of the seed is determined from a ROM table using the 7 MSBs (excluding 
the hidden bit) of the Fx mantissa as an index. The unbiased exponent of 
the seed is calculated as the twos complement of the unbiased Fx 
exponent, decremented by one; i.e., if e is the unbiased exponent of Fx, 
then the unbiased exponent of Fn = -e - 1. The sign of the seed is the sign 
of the input. ±Zero returns ±Infinity and sets the overflow flag. If the 
unbiased exponent of Fx is greater than +125, the result is ±Zero. A NAN 
input returns an all 1s result. 

The following code performs floating-point division using an iterative 
convergence algorithm.* The result is accurate to one LSB in whichever 
format mode, 32-bit or 40-bit, is set (32-bit only for ADSP-21010). The 
following inputs are required: F0=numerator, F12=denominator, F11=2.0. 
The quotient is returned in F0. (The two highlighted instructions can be 
removed if only a ±1 LSB accurate single-precision result is necessary.) 


F0=RECIPS F12, F7=F0;        {Get 8 bit seed R0=1/D} 
F12=F0*F12;                  {D' = D*R0} 
F7=F0*F7, F0=F11-F12;        {F0=R1=2-D', F7=N*R0} 
F12=F0*F12;                  {F12=D'=D'*R1} 
F7=F0*F7, F0=F11-F12;        {F7=N*R0*R1, F0=R2=2-D'} 
F12=F0*F12;                  {F12=D'=D'*R2} 
F7=F0*F7, F0=F11-F12;        {F7=N*R0*R1*R2, F0=R3=2-D'} 
F0=F0*F7;                    {F0=N*R0*R1*R2*R3} 


Note that this code segment can be made into a subroutine by adding an 
rts (db) clause to the third-to-last instruction. 
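
The convergence scheme in the listing above can be followed step by step in a host-side Python sketch. This is only an illustration under assumptions: the ROM seed table is not reproduced, so `recips_seed` (a hypothetical helper) fakes an 8-bit seed by quantizing 1/d, the arithmetic runs in Python doubles rather than the 32-bit or 40-bit formats, and a zero denominator is not handled.

```python
import math

def recips_seed(d, bits=8):
    """Stand-in for RECIPS: 1/d quantized to about `bits` significant bits."""
    m, e = math.frexp(1.0 / d)            # 1/d = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)

def divide(n, d, iterations=3):
    """Iterative convergence division mirroring the register usage above."""
    r = recips_seed(d)                    # F0 = R0, the seed for 1/D
    q = n                                 # F7 = N
    for _ in range(iterations):
        d = d * r                         # F12 = D' = D' * R
        q = q * r                         # F7  = N * R0 * R1 * ...
        r = 2.0 - d                       # F0  = next factor, 2 - D'
    return q * r                          # F0  = N * R0 * R1 * R2 * R3
```

Each pass drives D' closer to 1, so the accumulated product in `q` approaches N/D; the relative error is roughly squared on every pass, which is why a seed of about eight bits needs only three passes.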

Status flags: 

AZ Is set if the floating-point result is ±Zero (unbiased exponent of Fx is 
greater than +125), otherwise cleared 
AU Is cleared 

AN Is set if the input operand is negative, otherwise cleared 
AV Is set if the input operand is ±Zero, otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if the input operand is a NAN, otherwise cleared 

* Cavanagh, J. 1984. Digital Computer Arithmetic. McGraw-Hill. Page 284. 


ALU Floating-Point 

Fn = RSQRTS Fx 


Syntax: 


Fn = RSQRTS Fx 


Function: 

Creates a 4-bit accurate seed for 1/√Fx, the reciprocal square root 
of Fx. The mantissa of the seed is determined from a ROM table using the 
LSB of the biased exponent of Fx concatenated with the 6 MSBs (excluding 
the hidden bit) of the mantissa of Fx as an index. The unbiased exponent 
of the seed is calculated as the twos complement of the unbiased Fx 
exponent, shifted right by one bit and decremented by one; i.e., if e is the 
unbiased exponent of Fx, then the unbiased exponent of 
Fn = -INT[e/2] - 1. The sign of the seed is the sign of the input. ±Zero 
returns ±Infinity and sets the overflow flag. +Infinity returns +Zero. A 
NAN input returns an all 1s result. 

The following code calculates a floating-point reciprocal square root 
(1/√x) using a Newton-Raphson iteration algorithm.* The result is 
accurate to one LSB in whichever format mode, 32-bit or 40-bit, is set 
(32-bit only for ADSP-21010). To calculate the square root, simply 
multiply the result by the original input. The following inputs are 
required: F0=input, F8=3.0, F1=0.5. The result is returned in F4. (The four 
highlighted instructions can be removed if only a ±1 LSB accurate 
single-precision result is necessary.) 

F4=RSQRTS F0;                {Fetch 4-bit seed} 
F12=F4*F4;                   {F12=X0^2} 
F12=F12*F0;                  {F12=C*X0^2} 
F4=F1*F4, F12=F8-F12;        {F4=.5*X0, F12=3-C*X0^2} 
F4=F4*F12;                   {F4=X1=.5*X0(3-C*X0^2)} 
F12=F4*F4;                   {F12=X1^2} 
F12=F12*F0;                  {F12=C*X1^2} 
F4=F1*F4, F12=F8-F12;        {F4=.5*X1, F12=3-C*X1^2} 
F4=F4*F12;                   {F4=X2=.5*X1(3-C*X1^2)} 
F12=F4*F4;                   {F12=X2^2} 
F12=F12*F0;                  {F12=C*X2^2} 
F4=F1*F4, F12=F8-F12;        {F4=.5*X2, F12=3-C*X2^2} 
F4=F4*F12;                   {F4=X3=.5*X2(3-C*X2^2)} 


Note that this code segment can be made into a subroutine by adding an 
rts (db) clause to the third-to-last instruction. 
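
The same Newton-Raphson recurrence, X(k+1) = 0.5 * X(k) * (3 - C * X(k)^2), can be tried out in a short Python sketch. As with any host model, assumptions apply: the ROM seed is imitated by the hypothetical helper `rsqrts_seed`, which quantizes 1/sqrt(c) to about four bits, the arithmetic is in Python doubles, and only positive finite inputs are handled.

```python
import math

def rsqrts_seed(c, bits=4):
    """Stand-in for RSQRTS: 1/sqrt(c) quantized to about `bits` bits."""
    m, e = math.frexp(1.0 / math.sqrt(c))
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)

def rsqrt(c, iterations=3):
    """Newton-Raphson reciprocal square root, one loop pass per iteration."""
    x = rsqrts_seed(c)                    # F4  = X0
    for _ in range(iterations):
        t = x * x                         # F12 = X^2
        t = t * c                         # F12 = C * X^2
        x = 0.5 * x                       # F4  = .5 * X
        t = 3.0 - t                       # F12 = 3 - C * X^2
        x = x * t                         # F4  = .5 * X * (3 - C * X^2)
    return x
```

Multiplying the result by the input gives the square root, for example `9.0 * rsqrt(9.0)` is close to 3.0.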


Status flags: 

AZ Is set if the floating-point result is +Zero (Fx = +Infinity), otherwise cleared 
AU Is cleared 

AN Is set if the input operand is -Zero, otherwise cleared 

AV Is set if the input operand is ±Zero, otherwise cleared 

AC Is cleared 

AS Is cleared 

AI Is set if the input operand is negative and nonzero, or a NAN, otherwise 
cleared 

* Cavanagh, J. 1984. Digital Computer Arithmetic. McGraw-Hill. Page 278. 



ALU Floating-Point 

Fn = Fx COPYSIGN Fy 



Syntax: 

Fn = Fx COPYSIGN Fy 

Function: 

Copies the sign of the floating-point operand in register Fy to the 
floating-point operand from register Fx without changing the exponent or 
the mantissa. The result is placed in register Fn. A denormal input is 
flushed to ±Zero. A NAN input returns an all 1s result. 

Status flags: 

AZ Is set if the floating-point result is ±Zero, otherwise cleared 
AU Is cleared 

AN Is set if the floating-point result is negative, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, otherwise cleared 
ALU Floating-Point 

Fn = MIN(Fx, Fy) 


Syntax: 

Fn = MIN(Fx, Fy) 

Function: 

Returns the smaller of the floating-point operands in register Fx and Fy. A 
NAN input returns an all 1s result. MIN of +Zero and -Zero returns 
-Zero. Denormal inputs are flushed to ±Zero. 

Status flags: 

AZ Is set if the floating-point result is ±Zero, otherwise cleared. 

AU Is cleared 

AN Is set if the floating-point result is negative, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, otherwise cleared 
ALU Floating-Point 

Fn = MAX(Fx, Fy) 


Syntax: 

Fn = MAX(Fx, Fy) 

Function: 

Returns the larger of the floating-point operands in registers Fx and Fy. A 
NAN input returns an all 1s result. MAX of +Zero and -Zero returns 
+Zero. Denormal inputs are flushed to ±Zero. 

Status flags: 

AZ Is set if the floating-point result is ±Zero, otherwise cleared. 

AU Is cleared 

AN Is set if the floating-point result is negative, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, otherwise cleared 
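
The signed-zero rules for MIN and MAX are easy to miss, because +Zero and -Zero compare as equal. A minimal Python sketch follows; `fmin` and `fmax` are hypothetical names, and NAN inputs and denormal flushing are not modeled.

```python
import math

def fmin(fx, fy):
    """Model of Fn = MIN(Fx, Fy); MIN of +Zero and -Zero returns -Zero."""
    if fx == fy:                          # also true for +0.0 versus -0.0
        return fx if math.copysign(1.0, fx) < 0 else fy
    return fx if fx < fy else fy

def fmax(fx, fy):
    """Model of Fn = MAX(Fx, Fy); MAX of +Zero and -Zero returns +Zero."""
    if fx == fy:
        return fx if math.copysign(1.0, fx) > 0 else fy
    return fx if fx > fy else fy
```

Because an ordinary comparison cannot tell the two zeros apart, the sketch inspects the sign with `math.copysign` whenever the operands compare equal.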
ALU Floating-Point 

Fn = CLIP Fx BY Fy 


Syntax: 

Fn = CLIP Fx BY Fy 

Function: 

Returns the floating-point operand in Fx if the absolute value of the 
operand in Fx is less than the absolute value of the floating-point operand 
in Fy. Otherwise, returns |Fy| if Fx is positive, and -|Fy| if Fx is 
negative. A NAN input returns an all 1s result. Denormal inputs are 
flushed to ±Zero. 

Status flags: 

AZ Is set if the floating-point result is ±Zero, otherwise cleared. 

AU Is cleared 

AN Is set if the floating-point result is negative, otherwise cleared 
AV Is cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, otherwise cleared 
B.2.2 Multiplier Operations 

The multiplier operations are described in this section. Table B.3 
summarizes the syntax and opcodes for the fixed-point and floating-point 
multiplier operations. The rest of this section contains detailed 
descriptions of each operation. 

Fixed-point: 

Syntax                          Opcode 
Rn  = Rx * Ry mod2*             01yx f00r 
MRF = Rx * Ry mod2*             01yx f10r 
MRB = Rx * Ry mod2*             01yx f11r 
Rn  = MRF + Rx * Ry mod2*       10yx f00r 
Rn  = MRB + Rx * Ry mod2*       10yx f01r 
MRF = MRF + Rx * Ry mod2*       10yx f10r 
MRB = MRB + Rx * Ry mod2*       10yx f11r 
Rn  = MRF - Rx * Ry mod2*       11yx f00r 
Rn  = MRB - Rx * Ry mod2*       11yx f01r 
MRF = MRF - Rx * Ry mod2*       11yx f10r 
MRB = MRB - Rx * Ry mod2*       11yx f11r 
Rn  = SAT MRF mod1**            0000 f00x 
Rn  = SAT MRB mod1**            0000 f01x 
MRF = SAT MRF mod1**            0000 f10x 
MRB = SAT MRB mod1**            0000 f11x 
Rn  = RND MRF mod1**            0001 100x 
Rn  = RND MRB mod1**            0001 101x 
MRF = RND MRF mod1**            0001 110x 
MRB = RND MRB mod1**            0001 111x 
MRF = 0                         0001 0100 
MRB = 0                         0001 0110 
MR  = Rn 
Rn  = MR 

Floating-point: 

Syntax                          Opcode 
Fn  = Fx * Fy                   0011 0000 

*   See Table B.4 
**  See Table B.5 
y   y-input; 1=signed, 0=unsigned 
x   x-input; 1=signed, 0=unsigned 
f   format; 1=fractional, 0=integer 
r   rounding; 1=yes, 0=no 

Table B.3 Multiplier Operations 




Mod2 in Table B.3 is an optional modifier, enclosed in parentheses, 
consisting of three or four letters that indicate whether the x-input is 
signed (S) or unsigned (U), whether the y-input is signed or unsigned, 
whether the inputs are in integer (I) or fractional (F) format and whether 
the result when written to the register file is to be rounded-to-nearest (R). 
The options for mod2 and the corresponding opcode values are listed in 
Table B.4. 


Mod2        Opcode 
(SSI)       --11 0--0 
(SUI)       --01 0--0 
(USI)       --10 0--0 
(UUI)       --00 0--0 
(SSF)       --11 1--0 
(SUF)       --01 1--0 
(USF)       --10 1--0 
(UUF)       --00 1--0 
(SSFR)      --11 1--1 
(SUFR)      --01 1--1 
(USFR)      --10 1--1 
(UUFR)      --00 1--1 

Table B.4 Multiplier Mod2 Options 
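
To make the mapping from modifier letters to opcode bits concrete, the following Python sketch builds the 8-bit opcode for Rn = Rx * Ry (pattern 01yx f00r in Table B.3) from a mod2 string. The function names `encode_mod2` and `multiply_opcode` are hypothetical, and the sketch assumes the first letter describes the x-input and the second the y-input, as the modifier description states.

```python
def encode_mod2(mod2):
    """Map a mod2 string such as 'SSFR' to the opcode bits (y, x, f, r)."""
    if len(mod2) not in (3, 4) or mod2[0] not in "SU" or mod2[1] not in "SU":
        raise ValueError("unrecognized modifier: " + mod2)
    if mod2[2] not in "IF" or (len(mod2) == 4 and mod2[2:] != "FR"):
        raise ValueError("unrecognized modifier: " + mod2)
    x = 1 if mod2[0] == "S" else 0        # x-input signed?
    y = 1 if mod2[1] == "S" else 0        # y-input signed?
    f = 1 if mod2[2] == "F" else 0        # fractional format?
    r = 1 if len(mod2) == 4 else 0        # round-to-nearest (fractional only)
    return y, x, f, r

def multiply_opcode(mod2):
    """Opcode byte for Rn = Rx * Ry: bit pattern 01yx f00r."""
    y, x, f, r = encode_mod2(mod2)
    return (0b01 << 6) | (y << 5) | (x << 4) | (f << 3) | r
```

For example, `multiply_opcode("SSFR")` evaluates to `0b01111001`, and `multiply_opcode("SUI")` to `0b01010000`, in agreement with the yx and f, r columns of Table B.4.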

Similarly, mod1 in Table B.3 is an optional modifier, enclosed in 
parentheses, consisting of two letters that indicate whether the input is 
signed (S) or unsigned (U) and whether the input is in integer (I) or 
fractional (F) format. The options for mod1 and the corresponding opcode 
values are listed in Table B.5. 


Mod1                    Opcode 
(SI) (for SAT only)     ---- 0--1 
(UI) (for SAT only)     ---- 0--0 
(SF)                    ---- 1--1 
(UF)                    ---- 1--0 

Table B.5 Multiplier Mod1 Options 
Multiplier Fixed-Point 

Rn|MR = Rx * Ry 


Syntax: 

Rn  = Rx * Ry mod2 
MRF = Rx * Ry mod2 
MRB = Rx * Ry mod2 

Function: 

Multiplies the fixed-point fields in registers Rx and Ry. If rounding is 
specified (fractional data only), the result is rounded. The result is placed 
either in the fixed-point field in register Rn or one of the MR accumulation 
registers. If Rn is specified, only the portion of the result that has the same 
format as the inputs is transferred (bits 31-0 for integers, bits 63-32 for 
fractional). The floating-point extension field in Rn is set to all 0s. If MRF 
or MRB is specified, the entire 80-bit result is placed in MRF or MRB. 

Status flags: 

MN Is set if the result is negative, otherwise cleared 
MV Is set if the upper bits are not all zeros (signed or unsigned result) or 
ones (signed result). Number of upper bits depends on format. For a 
signed result, fractional=33, integer=49. For an unsigned result, 
fractional=32, integer=48. 

MU Is set if the upper 48 bits of a fractional result are all zeros (signed or 
unsigned result) or ones (signed result) and the lower 32 bits are not 
all zeros. Integer results do not underflow. 

MI Is cleared 
Multiplier Fixed-Point 

Rn|MR = MR + Rx * Ry 


Syntax: 

Rn  = MRF + Rx * Ry mod2 
Rn  = MRB + Rx * Ry mod2 
MRF = MRF + Rx * Ry mod2 
MRB = MRB + Rx * Ry mod2 


Function: 

Multiplies the fixed-point fields in registers Rx and Ry, and adds the 
product to the specified MR register value. If rounding is specified 
(fractional data only), the result is rounded. The result is placed either in 
the fixed-point field in register Rn or one of the MR accumulation 
registers, which must be the same MR register that provided the input. If 
Rn is specified, only the portion of the result that has the same format as 
the inputs is transferred (bits 31-0 for integers, bits 63-32 for fractional). 
The floating-point extension field in Rn is set to all 0s. If MRF or MRB is 
specified, the entire 80-bit result is placed in MRF or MRB. 


Status flags: 

MN Is set if the result is negative, otherwise cleared 

MV Is set if the upper bits are not all zeros (signed or unsigned result) or 
ones (signed result). Number of upper bits depends on format. For a 
signed result, fractional=33, integer=49. For an unsigned result, 
fractional=32, integer=48. 

MU Is set if the upper 48 bits of a fractional result are all zeros (signed or 
unsigned result) or ones (signed result) and the lower 32 bits are not 
all zeros. Integer results do not underflow. 

MI Is cleared 
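
For the signed-integer case, the accumulate step can be sketched in Python as follows. This is a simplified model under assumptions: the name `mac_ssi` is hypothetical, only the (SSI) modifier is covered, the MR register is treated as a plain 80-bit twos-complement value, and the status flags are not modeled. The function returns both views of the result, although a real instruction writes either Rn or the MR register.

```python
MASK32 = 0xFFFFFFFF
MASK80 = (1 << 80) - 1

def to_signed(value, bits):
    """Interpret the low `bits` of value as a twos-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mac_ssi(mr, rx, ry):
    """Model of MR + Rx * Ry (SSI): returns (80-bit MR value, 32-bit Rn value)."""
    acc = to_signed(mr, 80) + to_signed(rx, 32) * to_signed(ry, 32)
    new_mr = acc & MASK80                 # full 80-bit result, as kept in MRF or MRB
    rn = new_mr & MASK32                  # integer format: bits 31-0 go to Rn
    return new_mr, rn
```

Starting from a cleared accumulator, `mac_ssi(0, 3, 0xFFFFFFFE)` adds 3 * (-2), so the Rn view is `0xFFFFFFFA`; a second call with operands 2 and 5 brings the accumulator to 4.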
Multiplier Fixed-Point 

Rn|MR = MR - Rx * Ry 


Syntax: 

Rn  = MRF - Rx * Ry mod2 
Rn  = MRB - Rx * Ry mod2 
MRF = MRF - Rx * Ry mod2 
MRB = MRB - Rx * Ry mod2 


Function: 

Multiplies the fixed-point fields in registers Rx and Ry, and subtracts the 
product from the specified MR register value. If rounding is specified 
(fractional data only), the result is rounded. The result is placed either in 
the fixed-point field in register Rn or one of the MR accumulation 
registers, which must be the same MR register that provided the input. If 
Rn is specified, only the portion of the result that has the same format as 
the inputs is transferred (bits 31-0 for integers, bits 63-32 for fractional). 
The floating-point extension field in Rn is set to all 0s. If MRF or MRB is 
specified, the entire 80-bit result is placed in MRF or MRB. 


Status flags: 

MN Is set if the result is negative, otherwise cleared 

MV Is set if the upper bits are not all zeros (signed or unsigned result) or 
ones (signed result). Number of upper bits depends on format. For a 
signed result, fractional=33, integer=49. For an unsigned result, 
fractional=32, integer=48. 

MU Is set if the upper 48 bits of a fractional result are all zeros (signed or 
unsigned result) or ones (signed result) and the lower 32 bits are not 
all zeros. Integer results do not underflow. 

MI Is cleared 
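
The transfer rule in the Function paragraph (bits 31-0 for integers, bits 63-32 for fractional) can be sketched in Python. This is an illustrative model only, not processor code: the helper names mr_to_rn and rn_word are assumptions made for the example, and the 80-bit MR value is modelled as an unsigned Python integer.

```python
MASK32 = 0xFFFFFFFF
MASK80 = (1 << 80) - 1

def mr_to_rn(mr80: int, fractional: bool) -> int:
    """Select the 32 bits of an 80-bit MR value that are transferred to Rn."""
    mr80 &= MASK80
    # Fractional results come from bits 63-32, integer results from bits 31-0.
    shift = 32 if fractional else 0
    return (mr80 >> shift) & MASK32

def rn_word(fixed32: int) -> int:
    """Build the 40-bit register word: fixed-point field in bits 39-8,
    floating-point extension field (bits 7-0) set to all 0s."""
    return (fixed32 & MASK32) << 8

# Hypothetical MR contents: MR2 = 0x0001, MR1 = 0x12345678, MR0 = 0x9ABCDEF0
mr = (0x0001 << 64) | (0x12345678 << 32) | 0x9ABCDEF0
fractional_field = mr_to_rn(mr, fractional=True)
integer_field = mr_to_rn(mr, fractional=False)
```

The sketch covers only the choice of bits; rounding and the MN, MV, MU and MI flags are left out.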



Multiplier Fixed-Point 

Rn|MR = SAT MR 


Syntax: 

Rn  = SAT MRF mod1 
Rn  = SAT MRB mod1 
MRF = SAT MRF mod1 
MRB = SAT MRB mod1 


Function: 

If the value of the specified MR register is greater than the maximum 
value for the specified data format, the multiplier sets the result to the 
maximum value. Otherwise, the MR value is unaffected. The result is 
placed either in the fixed-point field in register Rn or one of the MR 
accumulation registers, which must be the same MR register that provided 
the input. If Rn is specified, only the portion of the result that has the same 
format as the inputs is transferred (bits 31-0 for integers, bits 63-32 for 
fractional). The floating-point extension field in Rn is set to all 0s. If MRF 
or MRB is specified, the entire 80-bit result is placed in MRF or MRB. 

Status flags: 

MN Is set if the result is negative, otherwise cleared 
MV Is cleared 

MU Is set if the upper 48 bits of a fractional result are all zeros (signed or 
unsigned result) or ones (signed result) and the lower 32 bits are not 
all zeros. Integer results do not underflow. 

MI Is cleared 


Multiplier Fixed-Point 

Rn|MR = RND MR 


Syntax: 

Rn  = RND MRF mod1 
Rn  = RND MRB mod1 
MRF = RND MRF mod1 
MRB = RND MRB mod1 


Function: 

Rounds the specified MR value to nearest at bit 32 (the MR1-MR0 
boundary). The result is placed either in the fixed-point field in register Rn 
or one of the MR accumulation registers, which must be the same MR 
register that provided the input. If Rn is specified, only the portion of the 
result that has the same format as the inputs is transferred (bits 31-0 for 
integers, bits 63-32 for fractional). The floating-point extension field in Rn 
is set to all 0s. If MRF or MRB is specified, the entire 80-bit result is placed 
in MRF or MRB. 

Status flags: 

MN Is set if the result is negative, otherwise cleared 
MV Is set if the upper bits are not all zeros (signed or unsigned result) or 
ones (signed result). Number of upper bits depends on format. For a 
signed result, fractional=33, integer=49. For an unsigned result, 
fractional=32, integer=48. 

MU Is set if the upper 48 bits of a fractional result are all zeros (signed or 
unsigned result) or ones (signed result) and the lower 32 bits are not 
all zeros. Integer results do not underflow. 

MI Is cleared 





MR=Rn / Rn=MR 


MR=0 

Syntax: MRF = 0 

MRB = 0 

Function: Sets the value of the specified MR register to zero. All 80 bits (MR2, 
MR1, MR0) are cleared. 

Status flags: 

MN Is cleared 
MV Is cleared 
MU Is cleared 
MI Is cleared 


MR=Rn/Rn=MR 

Function: A transfer to an MR register places the fixed-point field of register Rn in 
the specified MR register. The floating-point extension field in Rn is ignored. A 
transfer from an MR register places the specified MR register in the fixed-point 
field in register Rn. The floating-point extension field in Rn is set to all 0s. 


Syntax: MR0F = Rn 
        MR1F = Rn 
        MR2F = Rn 
        MR0B = Rn 
        MR1B = Rn 
        MR2B = Rn 

        Rn = MR0F 
        Rn = MR1F 
        Rn = MR2F 
        Rn = MR0B 
        Rn = MR1B 
        Rn = MR2B 


Compute Field: 

Bits     22    21-17    16    15-12    11-8    7-0 
Field    1     00000    T     Ai       Rk 



The MR register is specified by Ai and the data register by Rk. The direction of 
the transfer is determined by T (0=to register file, 1=to MR register). 


Ai      MR Register 

0000    MR0F 
0001    MR1F 
0010    MR2F 
0100    MR0B 
0101    MR1B 
0110    MR2B 

Status flags: 

MN Is cleared 
MV Is cleared 
MU Is cleared 
MI Is cleared 
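
As an illustration of how the T, Ai and Rk fields combine, the following Python sketch packs a 23-bit compute field for an MR transfer. The function name encode_mr_transfer is hypothetical, the bit positions follow the Compute Field layout as read from the figure above (bit 22 = 1, bits 21-17 = 00000, T in bit 16, Ai in bits 15-12, Rk in bits 11-8), and leaving bits 7-0 at zero is an assumption.

```python
# Ai codes for the MR registers, as listed in the table above.
MR_CODES = {
    "MR0F": 0b0000, "MR1F": 0b0001, "MR2F": 0b0010,
    "MR0B": 0b0100, "MR1B": 0b0101, "MR2B": 0b0110,
}

def encode_mr_transfer(mr_name: str, rk: int, to_mr: bool) -> int:
    """Pack the compute field for MRxx = Rk (to_mr=True) or Rk = MRxx."""
    if not 0 <= rk <= 15:
        raise ValueError("Rk must fit in the 4-bit field")
    ai = MR_CODES[mr_name]
    t = 1 if to_mr else 0          # 0 = to register file, 1 = to MR register
    # Bits 7-0 are left at zero (assumption; the figure shows no field there).
    return (1 << 22) | (t << 16) | (ai << 12) | (rk << 8)

field = encode_mr_transfer("MR1F", 5, to_mr=True)   # models MR1F = R5
```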






Multiplier Floating-Point 

Fn = Fx * Fy 



Syntax: 

Fn = Fx * Fy 


Function: 

Multiplies the floating-point operands in registers Fx and Fy. The result is 
placed in the register Fn. 

Status flags: 

MN Is set if the result is negative, otherwise cleared 
MV Is set if the unbiased exponent of the result is greater than 127, 
otherwise cleared 

MU Is set if the unbiased exponent of the result is less than -126, 
otherwise cleared 

MI Is set if either input is a NAN or if the inputs are ±Infinity and 
±Zero, otherwise cleared 

B.2.3 Shifter Operations 

Shifter operations are described in this section. Table B.6 summarizes the 
syntax and opcodes for the shifter operations. The rest of this section 
contains detailed descriptions of each operation. 


The shifter operates on the register file's 32-bit fixed-point fields (bits 39-8). 
Two-input shifter operations can take their y-input from the register 
file or from immediate data provided in the instruction. Either form uses 
the same opcode. However, the latter case, called an immediate shift or 
shifter immediate operation, is allowed only with instruction type 6, 
which has an immediate data field in its opcode for this purpose. All other 
instruction types must obtain the y-input from the register file when the 
compute operation is a two-input shifter operation. 


Syntax                                            Opcode 

Rn = LSHIFT Rx BY Ry|<data8>                      0000 0000 
Rn = Rn OR LSHIFT Rx BY Ry|<data8>                0010 0000 
Rn = ASHIFT Rx BY Ry|<data8>                      0000 0100 
Rn = Rn OR ASHIFT Rx BY Ry|<data8>                0010 0100 
Rn = ROT Rx BY Ry|<data8>                         0000 1000 

Rn = BCLR Rx BY Ry|<data8>                        1100 0100 
Rn = BSET Rx BY Ry|<data8>                        1100 0000 
Rn = BTGL Rx BY Ry|<data8>                        1100 1000 
BTST Rx BY Ry|<data8>                             1100 1100 

Rn = FDEP Rx BY Ry|<bit6>:<len6>                  0100 0100 
Rn = Rn OR FDEP Rx BY Ry|<bit6>:<len6>            0110 0100 
Rn = FDEP Rx BY Ry|<bit6>:<len6> (SE)             0100 1100 
Rn = Rn OR FDEP Rx BY Ry|<bit6>:<len6> (SE)       0110 1100 
Rn = FEXT Rx BY Ry|<bit6>:<len6>                  0100 0000 
Rn = FEXT Rx BY Ry|<bit6>:<len6> (SE)             0100 1000 

Rn = EXP Rx                                       1000 0000 
Rn = EXP Rx (EX)                                  1000 0100 
Rn = LEFTZ Rx                                     1000 1000 
Rn = LEFTO Rx                                     1000 1100 


Instruction modifiers: 

(SE) Sign extension of deposited or extracted field 
(EX) Extended exponent extract 


Table B.6 Shifter Operations 


Shifter 

Rn = LSHIFT Rx BY Ry|<data8> 



Syntax: 

Rn = LSHIFT Rx BY Ry 
Rn = LSHIFT Rx BY <data8> 

Function: 

Logically shifts the fixed-point operand in register Rx by the 32-bit value 
in register Ry or by the 8-bit immediate value in the instruction. The 
shifted result is placed in the fixed-point field of register Rn. The 
floating-point extension field of Rn is set to all 0s. The shift values are 
twos-complement numbers. Positive values select a left shift, negative 
values select a right shift. The 8-bit immediate data can take values 
between -128 and 127 inclusive, allowing for a shift of a 32-bit field from 
off-scale right to off-scale left. 

Status flags: 

SZ Is set if the shifted result is zero, otherwise cleared 

SV Is set if the input is shifted left by more than 0, otherwise cleared 

SS Is cleared 
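
The signed shift count described above can be modelled with a short Python sketch. It is a behavioural illustration under stated assumptions, not a definitive model of the shifter: the function name lshift is hypothetical, and the shift count is passed in as an already decoded twos-complement integer.

```python
MASK32 = 0xFFFFFFFF

def lshift(rx: int, shift: int):
    """Model Rn = LSHIFT Rx BY shift; returns (result, SZ, SV, SS)."""
    rx &= MASK32
    if shift >= 0:
        # Positive count: logical left shift, bits leaving bit 31 are lost.
        result = (rx << shift) & MASK32
    else:
        # Negative count: logical right shift, zeros enter from the left.
        result = rx >> -shift
    sz = result == 0          # SZ: shifted result is zero
    sv = shift > 0            # SV: input shifted left by more than 0
    return result, sz, sv, False   # SS is always cleared

left_one = lshift(0x80000001, 1)      # MSB is shifted out
off_scale = lshift(0xFFFFFFFF, -128)  # off-scale right leaves zero
```

A count of 32 or more in either direction moves every bit off-scale, so the result is zero and SZ is set.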


Shifter 

Rn = Rn OR LSHIFT Rx BY Ry|<data8> 


Syntax: 

Rn = Rn OR LSHIFT Rx BY Ry 
Rn = Rn OR LSHIFT Rx BY <data8> 

Function: 

Logically shifts the fixed-point operand in register Rx by the 32-bit value 
in register Ry or by the 8-bit immediate value in the instruction. The 
shifted result is logically ORed with the fixed-point field of register Rn 
and then written back to register Rn. The floating-point extension field of 
Rn is set to all 0s. The shift values are twos-complement numbers. Positive 
values select a left shift, negative values select a right shift. The 8-bit 
immediate data can take values between -128 and 127 inclusive, allowing 
for a shift of a 32-bit field from off-scale right to off-scale left. 

Status flags: 

SZ Is set if the shifted result is zero, otherwise cleared 

SV Is set if the input is shifted left by more than 0, otherwise cleared 

SS Is cleared 





Shifter 

Rn = ASHIFT Rx BY Ry|<data8> 


Syntax: 

Rn = ASHIFT Rx BY Ry 
Rn = ASHIFT Rx BY <data8> 

Function: 

Arithmetically shifts the fixed-point operand in register Rx by the 32-bit 
value in register Ry or by the 8-bit immediate value in the instruction. The 
shifted result is placed in the fixed-point field of register Rn. The 
floating-point extension field of Rn is set to all 0s. The shift values are 
twos-complement numbers. Positive values select a left shift, negative 
values select a right shift. The 8-bit immediate data can take values 
between -128 and 127 inclusive, allowing for a shift of a 32-bit field from 
off-scale right to off-scale left. 

Status flags: 

SZ Is set if the shifted result is zero, otherwise cleared 

SV Is set if the input is shifted left by more than 0, otherwise cleared 

SS Is cleared 
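
The difference from a logical shift shows up on right shifts, where the sign bit is replicated. A minimal Python sketch, assuming a decoded signed shift count and using the hypothetical names to_signed32 and ashift:

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a twos-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def ashift(rx: int, shift: int):
    """Model Rn = ASHIFT Rx BY shift; returns (result, SZ, SV, SS)."""
    value = to_signed32(rx)
    if shift >= 0:
        result = (value << shift) & MASK32
    else:
        # Python's >> on a negative int is an arithmetic (sign-filling) shift.
        result = (value >> -shift) & MASK32
    return result, result == 0, shift > 0, False

negative_right = ashift(0x80000000, -4)   # sign bit fills the vacated bits
```

Shifting a negative operand off-scale right leaves all 1s, whereas a non-negative operand leaves zero.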




Shifter 

Rn = Rn OR ASHIFT Rx BY Ry|<data8> 


Syntax: 

Rn = Rn OR ASHIFT Rx BY Ry 
Rn = Rn OR ASHIFT Rx BY <data8> 

Function: 

Arithmetically shifts the fixed-point operand in register Rx by the 32-bit 
value in register Ry or by the 8-bit immediate value in the instruction. The 
shifted result is logically ORed with the fixed-point field of register Rn 
and then written back to register Rn. The floating-point extension field of 
Rn is set to all 0s. The shift values are twos-complement numbers. Positive 
values select a left shift, negative values select a right shift. The 8-bit 
immediate data can take values between -128 and 127 inclusive, allowing 
for a shift of a 32-bit field from off-scale right to off-scale left. 

Status flags: 

SZ Is set if the shifted result is zero, otherwise cleared 

SV Is set if the input is shifted left by more than 0, otherwise cleared 

SS Is cleared 


Shifter 

Rn = ROT Rx BY Ry|<data8> 



Syntax: 

Rn = ROT Rx BY Ry 
Rn = ROT Rx BY <data8> 

Function: 

Rotates the fixed-point operand in register Rx by the 32-bit value in 
register Ry or by the 8-bit immediate value in the instruction. The rotated 
result is placed in the fixed-point field of register Rn. The floating-point 
extension field of Rn is set to all 0s. The shift values are twos-complement 
numbers. Positive values select a rotate left; negative values select a rotate 
right. The 8-bit immediate data can take values between -128 and 127 
inclusive, allowing for a rotate of a 32-bit field from full right wrap around 
to full left wrap around. 

Status flags: 

SZ Is set if the rotated result is zero, otherwise cleared 
SV Is cleared 

SS Is cleared 
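
A rotate by a signed count can be sketched in Python as follows. The name rot is hypothetical, and reducing the count modulo 32 is an assumption that follows from the 32-bit field wrapping around fully.

```python
MASK32 = 0xFFFFFFFF

def rot(rx: int, shift: int):
    """Model Rn = ROT Rx BY shift; returns (result, SZ)."""
    rx &= MASK32
    # Python's % always yields 0..31 here, so a negative count (rotate right)
    # becomes the equivalent rotate left.
    n = shift % 32
    result = ((rx << n) | (rx >> (32 - n))) & MASK32
    return result, result == 0

rotated_left = rot(0x80000001, 1)
rotated_right = rot(0x80000001, -1)
```

SV and SS are always cleared for this operation, so the sketch returns only the result and SZ.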


Shifter 

Rn = BCLR Rx BY Ry|<data8> 


Syntax: 

Rn = BCLR Rx BY Ry 
Rn = BCLR Rx BY <data8> 

Function: 

Clears a bit in the fixed-point operand in register Rx. The result is placed 
in the fixed-point field of register Rn. The floating-point extension field of 
Rn is set to all 0s. The position of the bit is the 32-bit value in register Ry or 
the 8-bit immediate value in the instruction. The 8-bit immediate data can 
take values between 31 and 0 inclusive, allowing for any bit within a 32-bit 
field to be cleared. If the bit position value is greater than 31 or less than 0, 
no bits are cleared. 

Status flags: 

SZ Is set if the output operand is 0, otherwise cleared 
SV Is set if the bit position is greater than 31, otherwise cleared 
SS Is cleared 

Note: This compute operation affects a bit in a register file location. There 
is also a bit manipulation instruction that affects one or more bits in a 
system register. This BIT CLR instruction should not be confused with the 
BCLR shifter operation. See Appendix E for more information on BIT 
CLR. 
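
BCLR, and the BSET and BTGL operations documented after it, differ only in how the selected bit is combined with Rx. The Python sketch below models all three under that reading; the function name bit_op and the string selector are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF

def bit_op(op: str, rx: int, pos: int):
    """Model Rn = BCLR|BSET|BTGL Rx BY pos; returns (result, SZ, SV)."""
    rx &= MASK32
    # A position outside 0..31 selects no bit, so the operand passes through.
    mask = (1 << pos) if 0 <= pos <= 31 else 0
    if op == "BCLR":
        result = rx & ~mask & MASK32
    elif op == "BSET":
        result = rx | mask
    elif op == "BTGL":
        result = rx ^ mask
    else:
        raise ValueError("unknown operation: " + op)
    sz = result == 0      # SZ: output operand is 0
    sv = pos > 31         # SV: bit position is greater than 31
    return result, sz, sv

cleared = bit_op("BCLR", 0xFF, 0)
out_of_range = bit_op("BTGL", 0xF, 40)   # no bit toggled, SV set
```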


Shifter 

Rn = BSET Rx BY Ry|<data8> 



Syntax: 

Rn = BSET Rx BY Ry 
Rn = BSET Rx BY <data8> 

Function: 

Sets a bit in the fixed-point operand in register Rx. The result is placed in 
the fixed-point field of register Rn. The floating-point extension field of Rn 
is set to all 0s. The position of the bit is the 32-bit value in register Ry or 
the 8-bit immediate value in the instruction. The 8-bit immediate data can 
take values between 31 and 0 inclusive, allowing for any bit within a 32-bit 
field to be set. If the bit position value is greater than 31 or less than 0, no 
bits are set. 

Status flags: 

SZ Is set if the output operand is 0, otherwise cleared 
SV Is set if the bit position is greater than 31, otherwise cleared 
SS Is cleared 

Note: This compute operation affects a bit in a register file location. There 
is also a bit manipulation instruction that affects one or more bits in a 
system register. This BIT SET instruction should not be confused with the 
BSET shifter operation. See Appendix E for more information on BIT SET. 


Shifter 

Rn = BTGL Rx BY Ry|<data8> 


Syntax: 

Rn = BTGL Rx BY Ry 
Rn = BTGL Rx BY <data8> 

Function: 

Toggles a bit in the fixed-point operand in register Rx. The result is placed 
in the fixed-point field of register Rn. The floating-point extension field of 
Rn is set to all 0s. The position of the bit is the 32-bit value in register Ry or 
the 8-bit immediate value in the instruction. The 8-bit immediate data can 
take values between 31 and 0 inclusive, allowing for any bit within a 32-bit 
field to be toggled. If the bit position value is greater than 31 or less than 0, 
no bits are toggled. 

Status flags: 

SZ Is set if the output operand is 0, otherwise cleared 
SV Is set if the bit position is greater than 31, otherwise cleared 
SS Is cleared 

Note: This compute operation affects a bit in a register file location. There 
is also a bit manipulation instruction that affects one or more bits in a 
system register. This BIT TGL instruction should not be confused with the 
BTGL shifter operation. See Appendix E for more information on BIT 
TGL. 


Shifter 

BTST Rx BY Ry|<data8> 



Syntax: 

BTST Rx BY Ry 
BTST Rx BY <data8> 

Function: 

Tests a bit in the fixed-point operand in register Rx. The SZ flag is set if the 
bit is a 0 and cleared if the bit is a 1. The position of the bit is the 32-bit 
value in register Ry or the 8-bit immediate value in the instruction. The 
8-bit immediate data can take values between 31 and 0 inclusive, allowing 
for any bit within a 32-bit field to be tested. If the bit position value is 
greater than 31 or less than 0, no bits are tested. 

Status flags: 

SZ Is cleared if the tested bit is a 1, is set if the tested bit is a 0 or if the 
bit position is greater than 31 

SV Is set if the bit position is greater than 31, otherwise cleared 
SS Is cleared 

Note: This compute operation tests a bit in a register file location. There is 
also a bit manipulation instruction that tests one or more bits in a system 
register. This BIT TST instruction should not be confused with the BTST 
shifter operation. See Appendix E for more information on BIT TST. 


Shifter 

Rn = FDEP Rx BY Ry|<bit6>:<len6> 


Syntax: 

Rn = FDEP Rx BY Ry 

Rn = FDEP Rx BY <bit6>:<len6> 

Function: 

Deposits a field from register Rx to register Rn. The input field is right-aligned 
within the fixed-point field of Rx. Its length is determined by the len6 field in 
register Ry or by the immediate len6 field in the instruction. The field is 
deposited in the fixed-point field of Rn, starting from a bit position determined 
by the bit6 field in register Ry or by the immediate bit6 field in the instruction. 
Bits to the left and to the right of the deposited field are set to 0. The 
floating-point extension field of Rn (bits 7-0 of the 40-bit word) is set to all 0s. 
Bit6 and len6 can take values between 0 and 63 inclusive, allowing for deposit of 
fields ranging in length from 0 to 32 bits, and to bit positions ranging from 0 to 
off-scale left. 

   39                          19     13     7        0 
   |                           | len6 | bit6 |        |  Ry 

   39                                        7        0 
   |                           |    field    |        |  Rx 

   len6 = number of bits to take from Rx, starting from LSB of 32-bit field 

   39                                        7        0 
   |           | deposit field |             |        |  Rn 
                               ^             ^ 
                               bit6          reference point 

   bit6 = starting bit position for deposit, 
   referenced from LSB of 32-bit field 



Example: If len6=14 and bit6=13, then the 14 bits of Rx are deposited in Rn bits 
34-21 (of the 40-bit word). 

   39       31       23       15       7       0 
   |        |        |--abcdef|ghijklmn|        |  Rx 
                        \_____________/ 
                            14 bits 

   39       31       23       15       7       0 
   |00000abc|defghijk|lmn00000|00000000|00000000|  Rn 
         \______________/ 
                        ^ 
                        bit position 13 (from reference point) 


Status flags: 

SZ Is set if the output operand is 0, otherwise cleared 

SV Is set if any bits are deposited to the left of the 32-bit fixed-point output 
   field (i.e., if len6 + bit6 > 32), otherwise cleared 

SS Is cleared 
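
The deposit can be sketched in Python as a mask followed by a shift. This is an illustrative model, not a definitive one: the names decode_ry and fdep are hypothetical, the Ry layout (bit6 in bits 5-0 and len6 in bits 11-6 of the 32-bit fixed-point field) is read from the figure's bit markers, and treating len6 values above 32 as 32 is an assumption.

```python
MASK32 = 0xFFFFFFFF

def decode_ry(ry_fixed: int):
    """Split Ry's fixed-point field into (bit6, len6), per the figure's layout."""
    return ry_fixed & 0x3F, (ry_fixed >> 6) & 0x3F

def fdep(rx: int, bit6: int, len6: int):
    """Model Rn = FDEP Rx BY <bit6>:<len6>; returns (result, SZ, SV)."""
    if not (0 <= bit6 <= 63 and 0 <= len6 <= 63):
        raise ValueError("bit6 and len6 are 6-bit values")
    width = min(len6, 32)                      # assumption for len6 > 32
    field = rx & MASK32 & ((1 << width) - 1)   # right-aligned input field
    result = (field << bit6) & MASK32          # bits shifted past bit 31 are lost
    sv = len6 + bit6 > 32                      # deposited left of the output field
    return result, result == 0, sv

# The example above: len6=14, bit6=13 places the field in bits 26-13 of the
# 32-bit field, which are bits 34-21 of the 40-bit word.
result, sz, sv = fdep(0x3FFF, 13, 14)
```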






Shifter 

Rn = Rn OR FDEP Rx BY Ry|<bit6>:<len6> 



Syntax: 

Rn = Rn OR FDEP Rx BY Ry 

Rn = Rn OR FDEP Rx BY <bit6>:<len6> 

Function: 

Deposits a field from register Rx to register Rn. The field value is logically ORed 
bitwise with the specified field of register Rn and the new value is written back 
to register Rn. The input field is right-aligned within the fixed-point field of Rx. 
Its length is determined by the len6 field in register Ry or by the immediate len6 
field in the instruction. The field is deposited in the fixed-point field of Rn, 
starting from a bit position determined by the bit6 field in register Ry or by the 
immediate bit6 field in the instruction. Bit6 and len6 can take values between 0 
and 63 inclusive, allowing for deposit of fields ranging in length from 0 to 32 bits, 
and to bit positions ranging from 0 to off-scale left. 

Example: 

   39       31       23       15       7       0 
   |        |        |--abcdef|ghijklmn|        |  Rx 
                        \_____________/ 
                           len6 bits 

   39       31       23       15       7       0 
   |abcdefgh|ijklmnop|qrstuvwx|yzabcdef|ghijklmn|  Rn old 
                        ^ 
                        bit position bit6 (from reference point) 

   39       31       23       15       7       0 
   |abcdeopq|rstuvwxy|zabtuvwx|yzabcdef|ghijklmn|  Rn new 
         \______________/ 
             OR result 



Status flags: 

SZ Is set if the output operand is 0, otherwise cleared 

SV Is set if any bits are deposited to the left of the 32-bit fixed-point output 
   field (i.e., if len6 + bit6 > 32), otherwise cleared 

SS Is cleared 


Shifter 

Rn = FDEP Rx BY Ry|<bit6>:<len6> (SE) 


Syntax: 

Rn = FDEP Rx BY Ry (SE) 

Rn = FDEP Rx BY <bit6>:<len6> (SE) 


Function: 

Deposits and sign-extends a field from register Rx to register Rn. The input field 
is right-aligned within the fixed-point field of Rx. Its length is determined by the 
len6 field in register Ry or by the immediate len6 field in the instruction. The 
field is deposited in the fixed-point field of Rn, starting from a bit position 
determined by the bit6 field in register Ry or by the immediate bit6 field in the 
instruction. The MSBs of Rn are sign-extended by the MSB of the deposited field, 
unless the MSB of the deposited field is off-scale left. Bits to the right of the 
deposited field are set to 0. The floating-point extension field of Rn (bits 7-0 of the 
40-bit word) is set to all 0s. Bit6 and len6 can take values between 0 and 63 
inclusive, allowing for deposit of fields ranging in length from 0 to 32 bits into bit 
positions ranging from 0 to off-scale left. 


   39                          19     13     7        0 
   |                           | len6 | bit6 |        |  Ry 

   39                                        7        0 
   |                           |    field    |        |  Rx 

   len6 = number of bits to take from Rx, starting from LSB of 32-bit field 

   39                                        7        0 
   | sign bit extension | deposit field |    |        |  Rn 
                                        ^    ^ 
                                        bit6 reference point 

   bit6 = starting bit position for deposit, 
   referenced from LSB of 32-bit field 


Example: 

   39       31       23       15       7       0 
   |        |        |--abcdef|ghijklmn|        |  Rx 
                        \_____________/ 
                           len6 bits 

   39       31       23       15       7       0 
   |aaaaaabc|defghijk|lmn00000|00000000|00000000|  Rn 
    \___/               ^ 
    sign                bit position bit6 (from reference point) 
    extension 

Status flags: 

SZ Is set if the output operand is 0, otherwise cleared 

SV Is set if any bits are deposited to the left of the 32-bit fixed-point output 
   field (i.e., if len6 + bit6 > 32), otherwise cleared 
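
The sign-extending deposit adds one step to the plain deposit: if the MSB of the deposited field is 1 and still lies inside the 32-bit field, the bits above it are filled with 1s. A hedged Python sketch, in which the name fdep_se is hypothetical and len6 values above 32 are assumed to behave as 32:

```python
MASK32 = 0xFFFFFFFF

def fdep_se(rx: int, bit6: int, len6: int):
    """Model Rn = FDEP Rx BY <bit6>:<len6> (SE); returns (result, SZ, SV)."""
    width = min(len6, 32)                      # assumption for len6 > 32
    field = rx & MASK32 & ((1 << width) - 1)
    result = field << bit6
    top = bit6 + width                         # one past the field's MSB
    # Sign-extend only when the field's MSB is not off-scale left.
    if width > 0 and top <= 32 and (field >> (width - 1)) & 1:
        result |= MASK32 & ~((1 << top) - 1)
    result &= MASK32
    return result, result == 0, len6 + bit6 > 32

negative_field = fdep_se(0x2000, 13, 14)   # MSB of the 14-bit field is 1
positive_field = fdep_se(0x1FFF, 13, 14)   # MSB of the 14-bit field is 0
```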


Shifter 

Rn = Rn OR FDEP Rx BY Ry|<bit6>:<len6> (SE) 


Syntax: 

Rn = Rn OR FDEP Rx BY Ry (SE) 

Rn = Rn OR FDEP Rx BY <bit6>:<len6> (SE) 

Function: 

Deposits and sign-extends a field from register Rx to register Rn. The sign- 
extended field value is logically ORed bitwise with the value of register Rn and 
the new value is written back to register Rn. The input field is right-aligned 
within the fixed-point field of Rx. Its length is determined by the len6 field in 
register Ry or by the immediate len6 field in the instruction. The field is 
deposited in the fixed-point field of Rn, starting from a bit position determined 
by the bit6 field in register Ry or by the immediate bit6 field in the instruction. 
Bit6 and len6 can take values between 0 and 63 inclusive, allowing for deposit of 
fields ranging in length from 0 to 32 bits into bit positions ranging from 0 to off- 
scale left. 

Example: 

   39       31       23       15       7       0 
   |        |        |--abcdef|ghijklmn|        |  Rx 
                        \_____________/ 
                           len6 bits 

   39       31       23       15       7       0 
   |aaaaaabc|defghijk|lmn00000|00000000|00000000| 
    \___/               ^ 
    sign                bit position bit6 (from reference point) 
    extension 

   39       31       23       15       7       0 
   |abcdefgh|ijklmnop|qrstuvwx|yzabcdef|ghijklmn|  Rn old 

   39       31       23       15       7       0 
   |vwxyzabc|defghijk|lmntuvwx|yzabcdef|ghijklmn|  Rn new 
    \___________________/ 
          OR result 

Status flags: 

SZ Is set if the output operand is 0, otherwise cleared 

SV Is set if any bits are deposited to the left of the 32-bit fixed-point output 
   field (i.e., if len6 + bit6 > 32), otherwise cleared 

SS Is cleared 


Shifter 

Rn = FEXT Rx BY Ry|<bit6>:<len6> 


Syntax: 

Rn = FEXT Rx BY Ry 

Rn = FEXT Rx BY <bit6>:<len6> 

Function: 

Extracts a field from register Rx to register Rn. The output field is placed right- 
aligned in the fixed-point field of Rn. Its length is determined by the len6 field in 
register Ry or by the immediate len6 field in the instruction. The field is extracted 
from the fixed-point field of Rx starting from a bit position determined by the 
bit6 field in register Ry or by the immediate bit6 field in the instruction. Bits to 
the left of the extracted field are set to 0 in register Rn. The floating-point 
extension field of Rn (bits 7-0 of the 40-bit word) is set to all 0s. Bit6 and len6 can 
take values between 0 and 63 inclusive, allowing for extraction of fields ranging 
in length from 0 to 32 bits, and from bit positions ranging from 0 to off-scale left. 


   39                          19     13     7        0 
   |                           | len6 | bit6 |        |  Ry 

   39                                        7        0 
   |           | extract field |             |        |  Rx 
                               ^             ^ 
                               bit6          reference point 

   bit6 = starting bit position for extract, 
   referenced from LSB of 32-bit field 

   39                                        7        0 
   |                           |extracted bits|       |  Rn 

   extracted bits placed in Rn, starting at LSB of 32-bit field 


Example: 

   39       31       23       15       7       0 
   |     abc|defghijk|lmn     |        |        |  Rx 
         \______________/ 
            len6 bits   ^ 
                        bit position bit6 (from reference point) 

   39       31       23       15       7       0 
   |00000000|00000000|00abcdef|ghijklmn|00000000|  Rn 


Status flags: 

SZ Is set if the output operand is 0, otherwise cleared 

SV Is set if any bits are extracted from the left of the 32-bit fixed-point input 
   field (i.e., if len6 + bit6 > 32), otherwise cleared 

SS Is cleared 
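
Extraction is the inverse of deposit: shift right by bit6, then keep len6 bits. A minimal Python sketch, assuming that bit positions to the left of the 32-bit field read as 0 and that len6 values above 32 behave as 32 (the name fext is hypothetical):

```python
MASK32 = 0xFFFFFFFF

def fext(rx: int, bit6: int, len6: int):
    """Model Rn = FEXT Rx BY <bit6>:<len6>; returns (result, SZ, SV)."""
    width = min(len6, 32)                       # assumption for len6 > 32
    # Bits beyond bit 31 of Rx do not exist, so they contribute 0s.
    result = ((rx & MASK32) >> bit6) & ((1 << width) - 1)
    sv = len6 + bit6 > 32                       # extracted from left of the field
    return result, result == 0, sv

extracted = fext(0x07FFE000, 13, 14)   # recovers a 14-bit field of all 1s
```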






Shifter 

Rn = FEXT Rx BY Ry|<bit6>:<len6> (SE) 



Syntax: 

Rn = FEXT Rx BY Ry (SE) 

Rn = FEXT Rx BY <bit6>:<len6> (SE) 

Function: 

Extracts and sign-extends a field from register Rx to register Rn. The output field 
is placed right-aligned in the fixed-point field of Rn. Its length is determined by 
the len6 field in register Ry or by the immediate len6 field in the instruction. The 
field is extracted from the fixed-point field of Rx starting from a bit position 
determined by the bit6 field in register Ry or by the immediate bit6 field in the 
instruction. The MSBs of Rn are sign-extended by the MSB of the extracted field, 
unless the MSB is extracted from off-scale left. The floating-point extension field 
of Rn (bits 7-0 of the 40-bit word) is set to all 0s. Bit6 and len6 can take values 
between 0 and 63 inclusive, allowing for extraction of fields ranging in length 
from 0 to 32 bits and from bit positions ranging from 0 to off-scale left. 

Example: 

   39       31       23       15       7       0 
   |     abc|defghijk|lmn     |        |        |  Rx 
         \______________/ 
            len6 bits   ^ 
                        bit position bit6 (from reference point) 

   39       31       23       15       7       0 
   |aaaaaaaa|aaaaaaaa|aaabcdef|ghijklmn|00000000|  Rn 
    \____________________/ 
        sign extension 


Status flags: 

SZ Is set if the output operand is 0, otherwise cleared 

SV Is set if any bits are extracted from the left of the 32-bit fixed-point input 
   field (i.e., if len6 + bit6 > 32), otherwise cleared 

SS Is cleared 
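
For the sign-extending form, the extracted field's MSB is copied into the higher bits of Rn unless that MSB would come from off-scale left. The Python sketch below illustrates this under the same assumptions as a plain extract (off-scale bits read as 0, len6 above 32 behaves as 32); the name fext_se is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def fext_se(rx: int, bit6: int, len6: int):
    """Model Rn = FEXT Rx BY <bit6>:<len6> (SE); returns (result, SZ, SV)."""
    width = min(len6, 32)                        # assumption for len6 > 32
    field = ((rx & MASK32) >> bit6) & ((1 << width) - 1)
    msb_in_range = width > 0 and bit6 + width <= 32
    # Sign-extend only when the field's MSB really comes from inside Rx.
    if msb_in_range and (field >> (width - 1)) & 1:
        field |= MASK32 & ~((1 << width) - 1)
    return field, field == 0, len6 + bit6 > 32

negative_field = fext_se(0x04000000, 13, 14)   # field MSB (bit 26 of Rx) is 1
```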


Shifter 

Rn = EXP Rx 


Syntax: 

Rn = EXP Rx 

Function: 

Extracts the exponent of the fixed-point operand in Rx. The exponent is 
placed in the shf8 field in register Rn. The exponent is calculated as the 
twos complement of: 

# leading sign bits in Rx - 1 

Status flags: 

SZ Is set if the extracted exponent is 0, otherwise cleared 
SV Is cleared 

SS Is set if the fixed-point operand in Rx is negative (bit 31 is a 1), 
otherwise cleared 
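
The exponent calculation, read here as the twos complement of (number of leading sign bits minus 1), can be sketched in Python as follows. The name exp_extract is hypothetical, and the sketch returns the 8-bit shf8 value without modelling where that field sits inside Rn.

```python
MASK32 = 0xFFFFFFFF

def exp_extract(rx: int):
    """Model Rn = EXP Rx; returns (exponent, shf8, SZ, SS)."""
    rx &= MASK32
    sign = rx >> 31
    # Count how many consecutive bits, starting at bit 31, equal the sign bit.
    leading = 0
    for bit in range(31, -1, -1):
        if (rx >> bit) & 1 != sign:
            break
        leading += 1
    exponent = -(leading - 1)     # twos complement of (leading sign bits - 1)
    shf8 = exponent & 0xFF        # 8-bit twos-complement encoding
    return exponent, shf8, exponent == 0, sign == 1

small_positive = exp_extract(0x00000001)   # 31 leading sign bits
```

An operand with a single sign bit, such as 0x40000000, needs no normalizing shift and yields an exponent of 0 with SZ set.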


Shifter 

Rn = EXP Rx (EX) 



Syntax: 

Rn = EXP Rx (EX) 

Function: 

Extracts the exponent of the fixed-point operand in Rx, assuming that the 
operand is the result of an ALU operation. The exponent is placed in the 
shf8 field in register Rn. If the AV status bit is set, a value of +1 is placed 
in the shf8 field to indicate an extra bit (the ALU overflow bit). If the AV 
status bit is not set, the exponent is calculated as the twos complement of: 

# leading sign bits in Rx - 1 

Status flags: 

SZ Is set if the extracted exponent is 0, otherwise cleared 
SV Is cleared 

SS Is set if the exclusive OR of the AV status bit and the sign bit (bit 31) 
of the fixed-point operand in Rx is equal to 1, otherwise cleared 


Shifter 

Rn = LEFTZ Rx 


Syntax: 

Rn = LEFTZ Rx 

Function: 

Extracts the number of leading 0s from the fixed-point operand in Rx. The 
extracted number is placed in the bit6 field in Rn. 

Status flags: 

SZ Is set if the MSB of Rx is 1, otherwise cleared 
SV Is set if the result is 32, otherwise cleared 
SS Is cleared 


Shifter 

Rn = LEFTO Rx 



Syntax: 

Rn = LEFTO Rx 

Function: 

Extracts the number of leading 1s from the fixed-point operand in Rx. The 
extracted number is placed in the bit6 field in Rn. 

Status flags: 

SZ Is set if the MSB of Rx is 0, otherwise cleared 
SV Is set if the result is 32, otherwise cleared 
SS Is cleared 
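
LEFTZ and LEFTO are the same count applied to opposite bit values, which the following Python sketch makes explicit. The helper names count_leading, leftz and lefto are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF

def count_leading(rx: int, bit_value: int) -> int:
    """Count consecutive bits equal to bit_value, starting from bit 31."""
    rx &= MASK32
    count = 0
    for bit in range(31, -1, -1):
        if (rx >> bit) & 1 != bit_value:
            break
        count += 1
    return count

def leftz(rx: int):
    """Model Rn = LEFTZ Rx; returns (count, SZ, SV)."""
    n = count_leading(rx, 0)
    return n, n == 0, n == 32    # SZ: MSB of Rx is 1; SV: result is 32

def lefto(rx: int):
    """Model Rn = LEFTO Rx; returns (count, SZ, SV)."""
    n = count_leading(rx, 1)
    return n, n == 0, n == 32    # SZ: MSB of Rx is 0; SV: result is 32

zeros = leftz(0x00010000)
ones = lefto(0xFFFF0000)
```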


B.2.4 Multifunction Computations 

Multifunction computations are of three types, each of which has a 
different format for the 23-bit compute field: 

• Dual add/subtract 

• Parallel multiplier/ALU 

• Parallel multiplier and add/subtract 



Multifunction 

Dual Add/Subtract (Fixed-Pt.) 



The dual add/subtract operation computes the sum and the difference of 
two inputs and returns the two results to different registers. There are 
fixed-point and floating-point versions of this operation. 

Fixed-Point: 

Syntax: 


Ra = Rx + Ry, Rs = Rx - Ry 

Compute Field: 

Bits     22    21-20    19-16    15-12    11-8    7-4    3-0 
Field    0     00       0111     RS       RA      RX     RY 


Function: 

Does a dual add/subtract of the fixed-point fields in registers Rx and Ry. 
The sum is placed in the fixed-point field of register Ra and the difference 
in the fixed-point field of Rs. The floating-point extension fields of Ra and 
Rs are set to all 0s. In saturation mode (the ALU saturation mode bit in 
MODE1 set) positive overflows return the maximum positive number 
(0x7FFF FFFF), and negative overflows return the minimum negative 
number (0x8000 0000). 

Status flags: 

AZ Is set if either of the fixed-point outputs is all 0s, otherwise cleared 
AU Is cleared 

AN Is set if the most significant bit of either of the outputs is 1, 
otherwise cleared 

AV Is set if the XOR of the carries of the two most significant adder 
stages of either of the outputs is 1, otherwise cleared 
AC Is set if the carry from the most significant adder stage of either of 
the outputs is 1, otherwise cleared 
AS Is cleared 
AI Is cleared 
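
The effect of saturation mode on the two results can be sketched in Python. This is a simplified model: the name dual_add_sub is hypothetical, the saturate argument stands in for the ALU saturation mode bit in MODE1, and the status flags are not modelled.

```python
MASK32 = 0xFFFFFFFF
INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000

def to_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a twos-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def dual_add_sub(rx: int, ry: int, saturate: bool):
    """Model Ra = Rx + Ry, Rs = Rx - Ry; returns (Ra, Rs) as 32-bit patterns."""
    x, y = to_signed32(rx), to_signed32(ry)

    def finish(value: int) -> int:
        if saturate:
            # Clamp overflows to 0x7FFF FFFF or 0x8000 0000.
            value = max(INT32_MIN, min(INT32_MAX, value))
        return value & MASK32      # without saturation the result wraps

    return finish(x + y), finish(x - y)

saturated = dual_add_sub(0x7FFFFFFF, 1, saturate=True)
wrapped = dual_add_sub(0x7FFFFFFF, 1, saturate=False)
```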


Multifunction 

Dual Add/Subtract (Floating-Pt.) 


Floating-Point: 

Syntax: 

Fa = Fx + Fy, Fs = Fx - Fy 

Compute Field: 

Bits     22    21-20    19-16    15-12    11-8    7-4    3-0 
Field    0     00       1111     FS       FA      FX     FY 


Function: 

Does a dual add/subtract of the floating-point operands in registers Fx 
and Fy. The normalized results are placed in registers Fa and Fs: the sum 
in Fa and the difference in Fs. Rounding is to nearest (IEEE) or by 
truncation, to a 32-bit or to a 40-bit boundary, as defined by the rounding 
mode and rounding boundary bits in MODE1. Post-rounded overflow 
returns ±Infinity (round-to-nearest) or ±NORM.MAX (round-to-zero). 
Post-rounded denormal returns ±Zero. Denormal inputs are flushed to 
±Zero. A NAN input returns an all 1s result. 

Status flags: 

AZ Is set if either of the post-rounded results is a denormal (unbiased 
exponent < -126) or zero, otherwise cleared 
AU Is set if either post-rounded result is a denormal, otherwise cleared 
AN Is set if either of the floating-point results is negative, otherwise 
cleared 

AV Is set if either of the post-rounded results overflows (unbiased 
exponent > +127), otherwise cleared 
AC Is cleared 
AS Is cleared 

AI Is set if either of the input operands is a NAN, or if both of the input 
operands are Infinities, otherwise cleared 


Multifunction 

Parallel Multiplier & ALU (Fixed-Pt.) 



The parallel multiplier/ALU operation performs a multiply or 
multiply/accumulate and one of the following ALU operations: add, 
subtract, average, fixed-point to floating-point or floating-point to 
fixed-point conversion, or floating-point ABS, MIN or MAX. 

For detailed information about a particular operation, see the individual 
descriptions under Single-Function Operations. 

Fixed-Point: 

Syntax: See Table B.7 

Compute Field: 

Bits     22    21-16     15-12    11-8    7-6    5-4    3-2    1-0 
Field    1     OPCODE    RM       RA      RXM    RYM    RXA    RYA 


Multifunction 

Parallel Multiplier & ALU (Floating-Pt.) 


Floating-Point: 

Syntax: See Table B.7 

Compute Field: 

Bits     22    21-16     15-12    11-8    7-6    5-4    3-2    1-0 
Field    1     OPCODE    FM       FA      FXM    FYM    FXA    FYA 


The multiplier and ALU operations are determined by OPCODE. The 
selections for the 6-bit OPCODE field are listed in Table B.7. The 
multiplier x- and y-operands are received from data registers RXM (FXM) 
and RYM (FYM). The multiplier result operand is returned to data 
register RM (FM). The ALU x- and y-operands are received from data 
registers RXA (FXA) and RYA (FYA). The ALU result operand is returned 
to data register RA (FA). 

The result operands can be returned to any registers within the register 
file. Each of the four input operands is restricted to a particular set of four 
data registers. 

Input            Allowed Sources 

Multiplier X:    R3-R0 (F3-F0) 
Multiplier Y:    R7-R4 (F7-F4) 
ALU X:           R11-R8 (F11-F8) 
ALU Y:           R15-R12 (F15-F12) 
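
The operand restriction can be expressed as a small check in Python. The sketch is illustrative only: the port names and the function operand_field are hypothetical, and encoding each 2-bit operand field as the register's offset within its group of four is an assumption that the text above does not state.

```python
# Allowed source registers for each input, from the list above.
ALLOWED = {
    "mul_x": range(0, 4),     # R3-R0   (F3-F0)
    "mul_y": range(4, 8),     # R7-R4   (F7-F4)
    "alu_x": range(8, 12),    # R11-R8  (F11-F8)
    "alu_y": range(12, 16),   # R15-R12 (F15-F12)
}

def operand_field(port: str, reg: int) -> int:
    """Validate a source register and return an assumed 2-bit field value."""
    allowed = ALLOWED[port]
    if reg not in allowed:
        raise ValueError(
            f"R{reg} is not an allowed source for {port} "
            f"(allowed: R{allowed.start}-R{allowed.stop - 1})"
        )
    # Assumption: the field holds the offset within the group of four.
    return reg - allowed.start

mul_y_field = operand_field("mul_y", 6)
```

A call such as operand_field("alu_x", 3) raises ValueError, mirroring a multifunction instruction that names a source register outside the permitted group.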


Syntax                                                   Opcode 

Rm=R3-0 * R7-4 (SSFR), Ra=R11-8 + R15-12                 000100 
Rm=R3-0 * R7-4 (SSFR), Ra=R11-8 - R15-12                 000101 
Rm=R3-0 * R7-4 (SSFR), Ra=(R11-8 + R15-12)/2             000110 
MRF=MRF + R3-0 * R7-4 (SSF), Ra=R11-8 + R15-12           001000 
MRF=MRF + R3-0 * R7-4 (SSF), Ra=R11-8 - R15-12           001001 
MRF=MRF + R3-0 * R7-4 (SSF), Ra=(R11-8 + R15-12)/2       001010 
Rm=MRF + R3-0 * R7-4 (SSFR), Ra=R11-8 + R15-12           001100 
Rm=MRF + R3-0 * R7-4 (SSFR), Ra=R11-8 - R15-12           001101 
Rm=MRF + R3-0 * R7-4 (SSFR), Ra=(R11-8 + R15-12)/2       001110 
MRF=MRF - R3-0 * R7-4 (SSF), Ra=R11-8 + R15-12           010000 
MRF=MRF - R3-0 * R7-4 (SSF), Ra=R11-8 - R15-12           010001 
MRF=MRF - R3-0 * R7-4 (SSF), Ra=(R11-8 + R15-12)/2       010010 
Rm=MRF - R3-0 * R7-4 (SSFR), Ra=R11-8 + R15-12           010100 
Rm=MRF - R3-0 * R7-4 (SSFR), Ra=R11-8 - R15-12           010101 
Rm=MRF - R3-0 * R7-4 (SSFR), Ra=(R11-8 + R15-12)/2       010110 

Fm=F3-0 * F7-4, Fa=F11-8 + F15-12                        011000 
Fm=F3-0 * F7-4, Fa=F11-8 - F15-12                        011001 
Fm=F3-0 * F7-4, Fa=FLOAT R11-8 by R15-12                 011010 
Fm=F3-0 * F7-4, Fa=FIX R11-8 by R15-12                   011011 
Fm=F3-0 * F7-4, Fa=(F11-8 + F15-12)/2                    011100 
Fm=F3-0 * F7-4, Fa=ABS F11-8                             011101 
Fm=F3-0 * F7-4, Fa=MAX (F11-8, F15-12)                   011110 
Fm=F3-0 * F7-4, Fa=MIN (F11-8, F15-12)                   011111 

Table B.7 Parallel Multiplier/ALU Computations 
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Multifunction 

Parallel Multiplier & Dual Add/Subtract 


The parallel multiplier and dual add/subtract operation performs a 
multiply or multiply/accumulate and computes the sum and the 
difference of the ALU inputs. For detailed information on the multiplier 
operations, see the individual descriptions under Single-Function 
Operations. For information on the dual add/subtract operation, see the 
Dual Add/Subtract section. 

Fixed-Point: 

Syntax: 

Rm=R3-0 * R7-4 (SSFR), Ra=R11-8 + R15-12, Rs=R11-8 - R15-12 

Compute Field: 

Bits:    22   21-20   19-16   15-12   11-8   7-6   5-4   3-2   1-0 
Field:   1    10      RS      RM      RA     RXM   RYM   RXA   RYA 


Floating-Point: 

Syntax: 

Fm=F3-0 * F7-4, Fa=F11-8 + F15-12, Fs=F11-8 - F15-12 

Compute Field: 

Bits:    22   21-20   19-16   15-12   11-8   7-6   5-4   3-2   1-0 
Field:   1    11      FS      FM      FA     FXM   FYM   FXA   FYA 


The multiplier x- and y-operands are received from data registers RXM 
(FXM) and RYM (FYM). The multiplier result operand is returned to data 
register RM (FM). The ALU x- and y-operands are received from data 
registers RXA (FXA) and RYA (FYA). The ALU result operands are 
returned to data register RA (FA) and RS (FS). 

The result operands can be returned to any registers within the register 
file. Each of the four input operands is restricted to a different set of four 
data registers. 


Input            Allowed Sources 
Multiplier X:    R3-R0 (F3-F0) 
Multiplier Y:    R7-R4 (F7-F4) 
ALU X:           R11-R8 (F11-F8) 
ALU Y:           R15-R12 (F15-F12) 
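
As a rough illustration of the dual add/subtract field layout, the sketch below unpacks a 23-bit compute field into its parts. The function name is hypothetical, and, as an assumption not stated in this section, each 2-bit operand field is treated as an offset into its allowed register group.

```python
def decode_dual_addsub(field):
    """Split a multiplier + dual add/subtract compute field into named parts.

    Layout from the diagrams: bit 22 = 1, bits 21-20 = 10 (fixed-point) or
    11 (floating-point), bits 19-16 difference result, bits 15-12 multiplier
    result, bits 11-8 sum result, then the four 2-bit operand fields.
    ASSUMPTION: operand fields are offsets within R3-R0, R7-R4, R11-R8, R15-R12.
    """
    if ((field >> 22) & 1) != 1 or ((field >> 21) & 1) != 1:
        raise ValueError("not a parallel multiplier & dual add/subtract field")
    return {
        "floating_point": bool((field >> 20) & 1),
        "rs": (field >> 16) & 0xF,          # ALU difference result
        "rm": (field >> 12) & 0xF,          # multiplier result
        "ra": (field >> 8) & 0xF,           # ALU sum result
        "rxm": ((field >> 6) & 0x3) + 0,    # multiplier x-operand
        "rym": ((field >> 4) & 0x3) + 4,    # multiplier y-operand
        "rxa": ((field >> 2) & 0x3) + 8,    # ALU x-operand
        "rya": (field & 0x3) + 12,          # ALU y-operand
    }
```

Decoding rather than encoding is shown here because it makes the operand restriction visible: whatever the 2-bit fields contain, the decoded inputs can only ever fall inside their four-register groups.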




C IEEE 1149.1 JTAG Test Access Port 


C.1 OVERVIEW 

A boundary scan allows a system designer to test interconnections on a 
printed circuit board with minimal test-specific hardware. The scan is 
made possible by the ability to control and monitor each input and output 
pin on each chip through a set of serially scannable latches. Each input 
and output is connected to a latch, and the latches are connected as a long 
shift register so that data can be read from or written to them through a 
serial test access port (TAP). The ADSP-21020/21010 contains a test access 
port compatible with the industry standard IEEE 1149.1 (JTAG) 
specification. 

Only the IEEE 1149.1 features specific to the ADSP-21020/21010 are 
described here. For more information, see the IEEE 1149.1 specification. 

The boundary scan allows a variety of functions to be performed on each 
input and output signal of the ADSP-21020/21010. Each input has a latch 
that monitors the value of the incoming signal and can also drive data into 
the chip in place of the incoming value. Similarly, each output has a latch 
that monitors the outgoing signal and can also drive the output in place of 
the outgoing value. For bidirectional pins, the combination of input and 
output functions is available. 

Every latch associated with a pin is part of a single serial shift register 
path. Each latch is a master/slave type latch with the controlling clock 
provided externally. This clock (TCK) is asynchronous to the ADSP- 
21020/21010 system clock (CLKIN). 







C.2 TEST ACCESS PORT 

The test access port of the ADSP-21020/21010 controls the operation of the 
boundary scan. The TAP consists of five pins that control a state machine, 
including the boundary scan. The state machine and pins conform to the 
IEEE 1149.1 specification. 


TCK (Input) 

Test Clock. Used to clock serial data into scan latches 
and control sequencing of the test state machine. TCK 
can be asynchronous with CLKIN. 

TMS (Input) 

Test Mode Select. Primary control signal for the state 
machine. Synchronous with TCK. A sequence of 
values on TMS adjusts the current state of the TAP. 

TDI (Input) 

Test Data Input. Serial input data to the scan latches. 
Synchronous with TCK. 

TDO (Output) 

Test Data Output. Serial output data from the scan 
latches. Synchronous with TCK. 

TRST (Input) 

Test Reset. Resets the test state machine. Can be 
asynchronous with TCK. 


C.3 INSTRUCTION REGISTER 

The instruction register allows an instruction to be shifted into the 
processor. This instruction selects the test to be performed and/or the test 
data register to be accessed. The instruction register is 4 bits long with no 
parity bit. A value of 0001 binary is loaded (LSB nearest TDO) into the 
instruction register whenever the TAP reset state is entered. 

Table C.1 lists the binary code for each instruction. Bit 1 is nearest TDI and 
bit 4 is nearest TDO. An x specifies a don't-care state. No data registers are 
placed into test modes by any of the public instructions. The instructions 
affect the ADSP-21020/21010 as defined in the 1149.1 specification. The 
optional instructions RUNBIST, IDCODE and USERCODE are not 
supported by the ADSP-21020/21010. 
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Instruction 


Bits                                Register 
1234    Name                        (Serial Path)    Type 

xxx1    BYPASS                      Bypass           Public 
0000    EXTEST                      Boundary         Public 
1000    SAMPLE/PRELOAD              Boundary         Public 
1100    INTEST                      Boundary         Public 
0100    Reserved for emulation                       Private 
xx10    Reserved for emulation                       Private 


Table C.1 Test Instructions 
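
The don't-care matching in Table C.1 can be illustrated with a small lookup. This is a sketch for reading the table, not emulator or test-equipment code; the names are hypothetical, and the 4-character string is written with bit 1 (nearest TDI) first, as in the table.

```python
# Rows of Table C.1: (bit pattern, name, serial path, type); 'x' is a don't-care.
TEST_INSTRUCTIONS = [
    ("xxx1", "BYPASS", "Bypass", "Public"),
    ("0000", "EXTEST", "Boundary", "Public"),
    ("1000", "SAMPLE/PRELOAD", "Boundary", "Public"),
    ("1100", "INTEST", "Boundary", "Public"),
    ("0100", "Reserved for emulation", None, "Private"),
    ("xx10", "Reserved for emulation", None, "Private"),
]


def lookup_instruction(bits):
    """Return (name, register, type) for a 4-bit code given as bits 1..4."""
    if len(bits) != 4 or any(b not in "01" for b in bits):
        raise ValueError("expected a string of four '0'/'1' characters")
    for pattern, name, register, kind in TEST_INSTRUCTIONS:
        # A pattern position matches if it is a don't-care or equals the bit.
        if all(p == "x" or p == b for p, b in zip(pattern, bits)):
            return name, register, kind
    return None
```

With these rows every one of the sixteen codes matches some entry, and the value 0001 loaded on TAP reset selects BYPASS.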


The entry under "Register" is the serial scan path, either Boundary or 
Bypass in this case, enabled by the instruction. Figure C.1 shows these 
register paths. The 1-bit Bypass register is fully defined in the 1149.1 
specification. The Boundary register is described in the next section. 

No special values need be written into any register prior to selection of 
any instruction. As Table C.1 shows, certain instructions are reserved for 
emulator use. See section C.7 for more information. 



Figure C.1 Serial Scan Paths 



C.4 BOUNDARY REGISTER 

The Boundary register is 286 bits long. This section defines the latch type 
and function of each position in the scan path. The positions are 
numbered with 286 being the first bit output (closest to TDO) and 1 being 
the last (closest to TDI). 
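
The position numbering implies a shift order: the value intended for position 286 is presented on TDI first, and the bit captured at position 286 is the first to appear on TDO. A minimal sketch of that bookkeeping follows; the helper names are hypothetical and no real JTAG library is involved.

```python
BOUNDARY_LENGTH = 286  # length of the Boundary register in bits


def shift_in_sequence(values, default=0):
    """Return bits in the order they are presented on TDI.

    `values` maps scan position (1..286) to 0 or 1. The first bit shifted in
    travels furthest, so it ends up at position 286; position 1 is shifted last.
    """
    for pos in values:
        if not 1 <= pos <= BOUNDARY_LENGTH:
            raise ValueError(f"scan position {pos} out of range")
    return [values.get(pos, default) for pos in range(BOUNDARY_LENGTH, 0, -1)]


def positions_from_tdo(captured):
    """Map bits observed on TDO (first bit out = position 286) to scan positions."""
    return {BOUNDARY_LENGTH - i: bit for i, bit in enumerate(captured)}
```

For example, to drive only position 2 (DMWR in the table that follows) to 1, `shift_in_sequence({2: 1})` yields a 286-entry list whose second-to-last element is 1.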

Scan 


Position*   Latch Type          Signal Name 

1           Input               DMTS 
2           Output              DMWR 
3           Input               DMACK 
4           Output              DMRD 
5           Clock**             CLKIN 
6           Output Control††    DMRD/DMWR Output Enable 
7           Input               RESET 
8           Output Control††    PMRD/PMWR Output Enable 
9           Output              PMRD 
10          Input               PMACK 
11          Output              PMWR 
12          Input               PMTS 
13          Output Control††    PMD Output Enable 
14          Input               PMD47 Input Latch 
15          Output              PMD47 Output Latch 
16          Input               PMD46 Input Latch 
17          Output              PMD46 Output Latch 
18          Input               PMD45 Input Latch 
19          Output              PMD45 Output Latch 
20          Input               PMD44 Input Latch 
21          Output              PMD44 Output Latch 
22          Input               PMD43 Input Latch 
23          Output              PMD43 Output Latch 
24          Input               PMD42 Input Latch 
25          Output              PMD42 Output Latch 
26          Input               PMD41 Input Latch 
27          Output              PMD41 Output Latch 
28          Input               PMD40 Input Latch 
29          Output              PMD40 Output Latch 
30          Input               PMD39 Input Latch 
31          Output              PMD39 Output Latch 
32          Input               PMD38 Input Latch 
33          Output              PMD38 Output Latch 
34          Input               PMD37 Input Latch 
35          Output              PMD37 Output Latch 
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Scan 


Position*   Latch Type          Signal Name 

36          Input               PMD36 Input Latch 
37          Output              PMD36 Output Latch 
38          Input               PMD35 Input Latch 
39          Output              PMD35 Output Latch 
40          Input               PMD34 Input Latch 
41          Output              PMD34 Output Latch 
42          Input               PMD33 Input Latch 
43          Output              PMD33 Output Latch 
44          Input               PMD32 Input Latch 
45          Output              PMD32 Output Latch 
46          Input               PMD31 Input Latch 
47          Output              PMD31 Output Latch 
48          Input               PMD30 Input Latch 
49          Output              PMD30 Output Latch 
50          Input               PMD29 Input Latch 
51          Output              PMD29 Output Latch 
52          Input               PMD28 Input Latch 
53          Output              PMD28 Output Latch 
54          Input               PMD27 Input Latch 
55          Output              PMD27 Output Latch 
56          Input               PMD26 Input Latch 
57          Output              PMD26 Output Latch 
58          Input               PMD25 Input Latch 
59          Output              PMD25 Output Latch 
60          Input               PMD24 Input Latch 
61          Output              PMD24 Output Latch 
62          Input               PMD23 Input Latch 
63          Output              PMD23 Output Latch 
64          Input               PMD22 Input Latch 
65          Output              PMD22 Output Latch 
66          Input               PMD21 Input Latch 
67          Output              PMD21 Output Latch 
68          Input               PMD20 Input Latch 
69          Output              PMD20 Output Latch 
70          Input               PMD19 Input Latch 
71          Output              PMD19 Output Latch 
72          Input               PMD18 Input Latch 
73          Output              PMD18 Output Latch 
74          Input               PMD17 Input Latch 
75          Output              PMD17 Output Latch 
76          Input               PMD16 Input Latch 
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Scan 

Position*   Latch Type          Signal Name 

77          Output              PMD16 Output Latch 
78          Input               PMD15 Input Latch 
79          Output              PMD15 Output Latch 
80          Input               PMD14 Input Latch 
81          Output              PMD14 Output Latch 
82          Input               PMD13 Input Latch 
83          Output              PMD13 Output Latch 
84          Input               PMD12 Input Latch 
85          Output              PMD12 Output Latch 
86          Input               PMD11 Input Latch 
87          Output              PMD11 Output Latch 
88          Input               PMD10 Input Latch 
89          Output              PMD10 Output Latch 
90          Input               PMD9 Input Latch 
91          Output              PMD9 Output Latch 
92          Input               PMD8 Input Latch 
93          Output              PMD8 Output Latch 
94          Input               PMD7 Input Latch 
95          Output              PMD7 Output Latch 
96          Input               PMD6 Input Latch 
97          Output              PMD6 Output Latch 
98          Input               PMD5 Input Latch 
99          Output              PMD5 Output Latch 
100         Input               PMD4 Input Latch 
101         Output              PMD4 Output Latch 
102         Input               PMD3 Input Latch 
103         Output              PMD3 Output Latch 
104         Input               PMD2 Input Latch 
105         Output              PMD2 Output Latch 
106         Input               PMD1 Input Latch 
107         Output              PMD1 Output Latch 
108         Input               PMD0 Input Latch 
109         Output              PMD0 Output Latch 
110         Output Control††    DMD Output Enable 
111         Input               DMD0 Input Latch 
112         Output              DMD0 Output Latch 
113         Input               DMD1 Input Latch 
114         Output              DMD1 Output Latch 
115         Input               DMD2 Input Latch 
116         Output              DMD2 Output Latch 
117         Input               DMD3 Input Latch 
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Scan 



Position*   Latch Type          Signal Name 

118         Output              DMD3 Output Latch 
119         Input               DMD4 Input Latch 
120         Output              DMD4 Output Latch 
121         Input               DMD5 Input Latch 
122         Output              DMD5 Output Latch 
123         Input               DMD6 Input Latch 
124         Output              DMD6 Output Latch 
125         Input               DMD7 Input Latch 
126         Output              DMD7 Output Latch 
127         Input               DMD8 Input Latch 
128         Output              DMD8 Output Latch 
129         Input               DMD9 Input Latch 
130         Output              DMD9 Output Latch 
131         Input               DMD10 Input Latch 
132         Output              DMD10 Output Latch 
133         Input               DMD11 Input Latch 
134         Output              DMD11 Output Latch 
135         Input               DMD12 Input Latch 
136         Output              DMD12 Output Latch 
137         Input               DMD13 Input Latch 
138         Output              DMD13 Output Latch 
139         Input               DMD14 Input Latch 
140         Output              DMD14 Output Latch 
141         Input               DMD15 Input Latch 
142         Output              DMD15 Output Latch 
143         Input               DMD16 Input Latch 
144         Output              DMD16 Output Latch 
145         Input               DMD17 Input Latch 
146         Output              DMD17 Output Latch 
147         Input               DMD18 Input Latch 
148         Output              DMD18 Output Latch 
149         Input               DMD19 Input Latch 
150         Output              DMD19 Output Latch 
151         Input               DMD20 Input Latch 
152         Output              DMD20 Output Latch 
153         Input               DMD21 Input Latch 
154         Output              DMD21 Output Latch 
155         Input               DMD22 Input Latch 
156         Output              DMD22 Output Latch 
157         Input               DMD23 Input Latch 
158         Output              DMD23 Output Latch 




Scan 


Position*   Latch Type          Signal Name 

159         Input               DMD24 Input Latch 
160         Output              DMD24 Output Latch 
161         Input               DMD25 Input Latch 
162         Output              DMD25 Output Latch 
163         Input               DMD26 Input Latch 
164         Output              DMD26 Output Latch 
165         Input               DMD27 Input Latch 
166         Output              DMD27 Output Latch 
167         Input               DMD28 Input Latch 
168         Output              DMD28 Output Latch 
169         Input               DMD29 Input Latch 
170         Output              DMD29 Output Latch 
171         Input               DMD30 Input Latch 
172         Output              DMD30 Output Latch 
173         Input               DMD31 Input Latch 
174         Output              DMD31 Output Latch 
175         Input               DMD32 Input Latch 
176         Output              DMD32 Output Latch 
177         Input               DMD33 Input Latch 
178         Output              DMD33 Output Latch 
179         Input               DMD34 Input Latch 
180         Output              DMD34 Output Latch 
181         Input               DMD35 Input Latch 
182         Output              DMD35 Output Latch 
183         Input               DMD36 Input Latch 
184         Output              DMD36 Output Latch 
185         Input               DMD37 Input Latch 
186         Output              DMD37 Output Latch 
187         Input               DMD38 Input Latch 
188         Output              DMD38 Output Latch 
189         Input               DMD39 Input Latch 
190         Output              DMD39 Output Latch 
191         Output              DOMS† 
192         Output              DMS3 Output Latch 
193         Output              DMS2 Output Latch 
194         Output              DMS1 Output Latch 
195         Output              DMS0 Output Latch 
196         Output              BG 
197         Input               BR 
198         Output              DMPAGE 
199         Output Control††    DMA Output Enable 
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Scan 

Position*   Latch Type          Signal Name 

200         Output              DMA31 
201         Output              DMA30 
202         Output              DMA29 
203         Output              DMA28 
204         Output              DMA27 
205         Output              DMA26 
206         Output              DMA25 
207         Output              DMA24 
208         Output              DMA23 
209         Output              DMA22 
210         Output              DMA21 
211         Output              DMA20 
212         Output              DMA19 
213         Output              DMA18 
214         Output              DMA17 
215         Output              DMA16 
216         Output              DMA15 
217         Output              DMA14 
218         Output              DMA13 
219         Output              DMA12 
220         Output              DMA11 
221         Output              DMA10 
222         Output              DMA9 
223         Output              DMA8 
224         Output              DMA7 
225         Output              DMA6 
226         Output              DMA5 
227         Output              DMA4 
228         Output Control††    FLAG3 Output Enable 
229         Output              DMA3 
230         Output              DMA2 
231         Output              DMA1 
232         Output              DMA0 
233         Output Control††    FLAG2 Output Enable 
234         Input               FLAG3 Input Latch 
235         Output              FLAG3 Output Latch 
236         Input               FLAG2 Input Latch 
237         Output              FLAG2 Output Latch 
238         Input               FLAG1 Input Latch 
239         Output              FLAG1 Output Latch 
240         Input               FLAG0 Input Latch 



Scan 

Position*   Latch Type          Signal Name 

241         Output              FLAG0 Output Latch 
242         Output Control††    FLAG1 Output Enable 
243         Input               IRQ0 
244         Input               IRQ1 
245         Input               IRQ2 
246         Input               IRQ3 
247         Output              IAS† 
248         Output Control††    FLAG0 Output Enable 
249         Output              CAV† 
250         Output              CA4† 
251         Output              CA3† 
252         Output              CA2† 
253         Output              CA1† 
254         Output              CA0† 
255         Output              EXAB† 
256         Output              TIMEXP 
257         Output              PMA0 
258         Output              PMA1 
259         Output              PMA2 
260         Output              PMA3 
261         Output              PMA4 
262         Output              PMA5 
263         Output              PMA6 
264         Output              PMA7 
265         Output              PMA8 
266         Output              PMA9 
267         Output              PMA10 
268         Output              PMA11 
269         Output              PMA12 
270         Output              PMA13 
271         Output              PMA14 
272         Output              PMA15 
273         Output              PMA16 
274         Output              PMA17 
275         Output              PMA18 
276         Output              PMA19 
277         Output              PMA20 
278         Output              PMA21 
279         Output              PMA22 
280         Output              PMA23 
281         Output Control††    PMA Output Enable 
282         Output              PMS1 
283         Output              PMS0 



Scan 

Position*   Latch Type          Signal Name 

284         Output              PMPAGE 
285         Output              POMS0† 
286         Output              POMS1† 


* Position 1 = closest to TDI (scan in last); position 286 = closest to TDO (scan in first) 
** CLKIN can be sampled but not controlled (read-only). CLKIN continues to clock the 
ADSP-21020/21010 no matter which instruction is enabled. 
† Signals reserved for emulator use. Can be set to any state during scan. 

†† 1 = Drive the associated signals during the EXTEST and INTEST instructions 
   0 = Tristate the associated signals during the EXTEST and INTEST instructions 


C.5 DEVICE IDENTIFICATION REGISTER 

No device identification register is included in the ADSP-21020/21010. 


C.6 BUILT-IN SELF-TEST OPERATION (BIST) 

No self-test functions are supported by the ADSP-21020/21010. 


C.7 PRIVATE INSTRUCTIONS 

Loading a value of 01 xx into the instruction register enables the private 
instructions reserved for emulation. The ADSP-21020/21010 EZ-ICE 
emulator uses the TAP and boundary scan as a way to access the 
processor in the target system. Use of the EZ-ICE emulator requires a 
target board connector for access to the TAP. See "EZ-ICE Emulator 
Considerations" in Chapter 9 for information on this connector. 
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Numeric Formats 


D.1 OVERVIEW 

The ADSP-21020 and ADSP-21010 support the 32-bit single-precision 
floating-point data format defined in the IEEE Standard 754/854. In 
addition, the ADSP-21020 supports an extended-precision version of the 
same format with eight additional bits in the mantissa (40 bits total). Both 
the ADSP-21020 and ADSP-21010 also support 32-bit fixed-point 
formats — fractional and integer — which can be signed (twos-complement) 
or unsigned. 


D.2 IEEE SINGLE-PRECISION FLOATING-POINT DATA FORMAT 

IEEE Standard 754/854 specifies a 32-bit single-precision floating-point 
format, shown in Figure D.1. A number in this format consists of a sign 
bit s, a 24-bit significand, and an 8-bit unsigned-magnitude exponent e. 
For normalized numbers, the significand consists of a 23-bit fraction f and 
a "hidden" bit of 1 that is implicitly presumed to precede f22 in the 
significand. The binary point is presumed to lie between this hidden bit 
and f22. The least significant bit (LSB) of the fraction is f0; the LSB of the 
exponent is e0. The hidden bit effectively increases the precision of the 
floating-point significand to 24 bits from the 23 bits actually stored in the 
data format. It also ensures that the significand of any number in the IEEE 
normalized-number format is always greater than or equal to 1 and less 
than 2. 



Figure D.1 IEEE 32-Bit Single-Precision Floating-Point Format 

The unsigned exponent e can range between 1 ≤ e ≤ 254 for normal 
numbers in the single-precision format. This exponent is biased by +127 
(254 ÷ 2). To calculate the true unbiased exponent, 127 must be subtracted 
from e. 





The IEEE Standard also provides for several special data types in the 

single-precision floating-point format: 

• An exponent value of 255 (all ones) with a nonzero fraction is a Not-A- 
Number (NAN). NANs are usually used as flags for data flow control, 
for the values of uninitialized variables, and for the results of invalid 
operations such as 

• Infinity is represented as an exponent of 255 and a zero fraction. Note 
that because the fraction is signed, both positive and negative Infinity 
can be represented. 

• Zero is represented by a zero exponent and a zero fraction. As with 
Infinity, both positive Zero and negative Zero can be represented. 


The IEEE single-precision floating-point data types supported by the 
ADSP-21020 and ADSP-21010 and their interpretations are summarized in 
Table D.1. 


Type        Exponent       Fraction    Value 

NAN         255            Nonzero     Undefined 
Infinity    255            0           (-1)^s Infinity 
Normal      1 ≤ e ≤ 254    Any         (-1)^s (1.f22-0) 2^(e-127) 
Zero        0              0           (-1)^s Zero 


Table D.1 IEEE Single-Precision Floating-Point Data Types 
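
The classification in Table D.1 can be written out as a short decoder. This is an illustrative sketch with a hypothetical function name. A zero exponent with a nonzero fraction is not listed in the table, so the sketch reports it separately rather than assigning it a value.

```python
def decode_single(word):
    """Classify a 32-bit pattern per Table D.1 and return (type, value)."""
    s = (word >> 31) & 1          # sign bit
    e = (word >> 23) & 0xFF       # biased 8-bit exponent
    f = word & 0x7FFFFF           # 23-bit fraction f22-0
    sign = -1.0 if s else 1.0
    if e == 255:
        if f != 0:
            return ("NAN", None)
        return ("Infinity", sign * float("inf"))
    if e == 0:
        if f == 0:
            return ("Zero", sign * 0.0)
        return ("Not listed in Table D.1", None)
    # Normal number: hidden bit of 1 precedes the fraction, exponent bias is 127.
    return ("Normal", sign * (1 + f / 2**23) * 2.0 ** (e - 127))
```

For instance, 0x40400000 has e = 128 and a fraction of one half, so it decodes as 1.5 x 2^1 = 3.0.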


D.3 EXTENDED FLOATING-POINT FORMAT 

The extended precision floating-point format is 40 bits wide, with the 
same 8-bit exponent as in the standard format but a 32-bit significand. 
This format is shown in Figure D.2. In all other respects, the extended 
floating-point format is the same as the IEEE standard format. 



Figure D.2 40-Bit Extended-Precision Floating-Point Format 
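
A sketch of how a normal number in the 40-bit format might be evaluated follows. The bit positions are an assumption made for illustration (sign in bit 39, exponent in bits 38-31, a 31-bit fraction in bits 30-0, which together with the hidden bit gives the 32-bit significand); the function names are hypothetical.

```python
def decode_extended(word40):
    """Evaluate a normal 40-bit extended-precision number.

    ASSUMED layout: bit 39 sign, bits 38-31 exponent, bits 30-0 fraction.
    """
    s = (word40 >> 39) & 1
    e = (word40 >> 31) & 0xFF
    f = word40 & 0x7FFFFFFF
    if not 1 <= e <= 254:
        raise ValueError("this sketch handles normal numbers only")
    sign = -1.0 if s else 1.0
    return sign * (1 + f / 2**31) * 2.0 ** (e - 127)


def single_to_extended(word32):
    """Widen a 32-bit pattern by appending eight zero fraction bits (assumed layout)."""
    return (word32 & 0xFFFFFFFF) << 8
```

Under this assumed layout, the smallest step above 1.0 is 2^-31 in the extended format, compared with 2^-23 in single precision.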




D.4 FIXED-POINT FORMATS 

The ADSP-21020 and ADSP-21010 support two 32-bit fixed-point formats: 
fractional and integer. In both formats, numbers can be signed 
(twos-complement) or unsigned. The four possible combinations are shown in 
Figure D.3. In the fractional format, there is an implied binary point to the 
left of the most significant magnitude bit. In integer format, the binary 
point is understood to be to the right of the LSB. Note that the sign bit is 
negatively weighted in a twos-complement format. 


Signed Integer (bit 31 is the sign bit) 
Bit       31       30       29      ...    2        1        0 
Weight    -2^31    2^30     2^29    ...    2^2      2^1      2^0 

Signed Fractional (bit 31 is the sign bit) 
Bit       31       30       29      ...    2        1        0 
Weight    -2^0     2^-1     2^-2    ...    2^-29    2^-30    2^-31 

Unsigned Integer 
Bit       31       30       29      ...    2        1        0 
Weight    2^31     2^30     2^29    ...    2^2      2^1      2^0 

Unsigned Fractional 
Bit       31       30       29      ...    2        1        0 
Weight    2^-1     2^-2     2^-3    ...    2^-30    2^-31    2^-32 

Figure D.3 32-Bit Fixed-Point Formats 
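
The four weightings in Figure D.3 amount to one raw 32-bit pattern read four different ways. A minimal sketch, with a hypothetical helper name:

```python
def fixed_value(word, signed, fractional):
    """Interpret a 32-bit pattern in one of the four fixed-point formats."""
    word &= 0xFFFFFFFF
    if signed and word & 0x80000000:
        word -= 1 << 32               # sign bit carries a negative weight
    if fractional:
        # Signed fractions scale by 2^-31, unsigned fractions by 2^-32.
        return word / (2**31 if signed else 2**32)
    return word
```

The pattern 0x80000000, for example, reads as -2^31 (signed integer), -1.0 (signed fractional), 2^31 (unsigned integer) or 0.5 (unsigned fractional).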








ALU outputs always have the same width and data format as the inputs. 
The multiplier, however, produces a 64-bit product from two 32-bit 
inputs. If both operands are unsigned integers, the result is a 64-bit 
unsigned integer. If both operands are unsigned fractions, the result is a 
64-bit unsigned fraction. These formats are shown in Figure D.4. 


Unsigned Integer 
Bit       63       62       61      ...    2        1        0 
Weight    2^63     2^62     2^61    ...    2^2      2^1      2^0 

Unsigned Fractional 
Bit       63       62       61      ...    2        1        0 
Weight    2^-1     2^-2     2^-3    ...    2^-62    2^-63    2^-64 

Figure D.4 64-Bit Unsigned Fixed-Point Product 

If one operand is signed and the other unsigned, the result is signed. If 
both inputs are signed, the result is signed and automatically shifted left 
one bit. The LSB becomes zero and bit 62 moves into the sign bit position. 
Normally bit 63 and bit 62 are identical when both operands are signed. 
(The only exception is full-scale negative multiplied by itself.) Thus, the 
left shift normally removes a redundant sign bit, increasing the precision 
of the most significant product. Also, if the data format is fractional, a 
single-bit left shift renormalizes the MSP to a fractional format. The signed 
formats with and without left shifting are shown in Figure D.5. 

The multiplier has an 80-bit accumulator to allow the accumulation of 64- 
bit products. The multiplier and accumulator are described in detail in 
Chapter 2. 
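
The effect of the automatic left shift on a signed-by-signed product can be modeled in a few lines. This is an idealized arithmetic sketch with hypothetical names. It models only the 64-bit product described above, not the 80-bit accumulator, and the wrap-around in the full-scale-negative case is a property of this model rather than a statement about the hardware.

```python
MASK64 = (1 << 64) - 1


def to_signed(value, bits):
    """Reinterpret the low `bits` bits of `value` as a twos-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def signed_product(x, y):
    """64-bit product of two signed 32-bit operands, shifted left one bit."""
    product = to_signed(x, 32) * to_signed(y, 32)
    return (product << 1) & MASK64    # LSB becomes zero; bit 62 moves to bit 63


def fractional_value(result64):
    """Read the shifted product with weights -2^0, 2^-1, ..., 2^-63."""
    return to_signed(result64, 64) / 2**63
```

With fractional operands 0.5 x 0.5 (0x40000000 each), the shifted product reads as 0.25, which illustrates the renormalization described above. In this model, full-scale negative times itself (0x80000000 squared) wraps to -1.0, the one case where bits 63 and 62 of the unshifted product differ.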
Signed Integer, No Left Shift (bit 63 is the sign bit) 
Bit       63       62       61      ...    2        1        0 
Weight    -2^63    2^62     2^61    ...    2^2      2^1      2^0 

Signed Integer With Left Shift (bit 63 is the sign bit; a 0 is shifted into bit 0) 
Bit       63       62       61      ...    2        1        0 
Weight    -2^62    2^61     2^60    ...    2^1      2^0      2^-1 

Signed Fractional, No Left Shift (bit 63 is the sign bit) 
Bit       63       62       61      ...    2        1        0 
Weight    -2^0     2^-1     2^-2    ...    2^-61    2^-62    2^-63 

Signed Fractional With Left Shift (bit 63 is the sign bit; a 0 is shifted into bit 0) 
Bit       63       62       61      ...    2        1        0 
Weight    -2^-1    2^-2     2^-3    ...    2^-62    2^-63    2^-64 

Figure D.5 64-Bit Signed Fixed-Point Product 
E Control/Status Registers 


E.1 OVERVIEW 

This appendix describes the subset of universal registers known as system 
registers and the operations that can be performed on them. It also 
summarizes the bit definitions of the system registers that contain control 
or status information. For convenience, two other control registers, 
PMWAIT and DMWAIT, are also summarized here, but these registers are 
not system registers. 

The bit names that appear with each definition are used by convention 
only; they are not part of the instruction set. 

E.2 SYSTEM REGISTERS 

System registers are the universal registers listed in Table E.1. The system 
registers are a subset of the universal register set. They can be written 
from an immediate field in an instruction or they can be loaded from or 
stored to data memory. They can also be transferred to or from any other 
universal register in one cycle. 

Register   Function                       Value After Reset 

MODE1      mode control 1 (see E.3)       0x0000 (cleared) 
MODE2      mode control 2 (see E.4)       0xn000 0000 (bits 28-31 are the device 
                                          identification field, identifying the 
                                          silicon revision #) 
IRPTL      interrupt latch (see E.7)      0x0000 (cleared) 
IMASK      interrupt mask (see E.7)       0x0003 
IMASKP     interrupt mask pointer         0x0000 (cleared) 
ASTAT      arithmetic status (see E.5)    0x00nn 0000 (bits 19-22 are equal to the 
                                          values of the FLAG0-3 input pins; the flag 
                                          pins are configured as inputs after reset) 
STKY       sticky status (see E.6)        0x0540 0000 
USTAT1     user status 1                  0x0000 (cleared) 
USTAT2     user status 2                  0x0000 (cleared) 

Table E.1 System Registers 
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A write to any system register except USTAT1 or USTAT2 has one cycle of 
latency before any changes are effective. No wait states are inserted. If a 
write to a system register is immediately followed by a read, the value 
read is the new one, except for IMASKP, which requires an extra cycle 
before the value is updated. 

E.2.1 System Register Bit Operations 

The system registers differ from other universal registers in that 
individual groups of bits can be set, cleared, XORed, toggled or tested 
using an immediate field in the bit manipulation instruction to specify the 
affected bits. See the instruction description in Appendix A for specifics. 

Although the shifter and ALU have bit manipulation capabilities, these 
computations operate on register file locations only. System register bit 
manipulation instructions eliminate the overhead of transferring system 
registers to and from the register file. 


Bit Instruction 
(System Registers) 

BIT SET register data 
BIT CLR register data 
BIT TGL register data 
BIT TST register data 
(result in BTF flag) 

E.2.1.1 Bit Test Flag 

The test and XOR operations of the bit manipulation instruction store the 
result in the bit test flag (BTF, bit 18 in the ASTAT register). The state of 
BTF is a condition that you can specify in conditional instructions. The test 
operation sets BTF if all specified bits in the system register are set. The 
XOR operation sets BTF if all bits in the system register match the 
specified bit pattern. 
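
The two conditions under which BTF is set can be stated as predicates. This is a sketch of the rules as worded above, with hypothetical function names; it models only the flag result, not any change to the register contents.

```python
def btf_after_test(register, data):
    """BTF after a bit test: set only if every bit selected by `data` is set."""
    return (register & data) == data


def btf_after_xor(register, data):
    """BTF after an XOR: set only if the whole 32-bit register equals `data`."""
    return ((register ^ data) & 0xFFFFFFFF) == 0
```

The difference matters when the register has extra bits set: a test of 0x3 against 0xB succeeds because bits 0 and 1 are both set, while an XOR comparison of 0xB with 0x3 fails because bit 3 does not match.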

E.2.2 User Registers 

Two undedicated 32-bit status registers, USTAT1 and USTAT2, are user- 
defined. Bits in these registers can be set and tested using system register 
instructions. You can use these registers for low-overhead, general- 
purpose software flags or for temporary storage of data. 


Shifter Operation 
(Register File Locations) 

Rn = BSET Rx BY Ry | data 
Rn = BCLR Rx BY Ry | data 
Rn = BTGL Rx BY Ry | data 
BTST Rx BY Ry | data 
(result in SZ status flag) 



E.3 MODE1 REGISTER 

Bit      Name      Definition 

0                  Reserved 
1        BR0       Bit-reverse for I0 (uses DMS0 only) 
2        SRCU      Alternate register select for computation units 
3        SRD1H     DAG1 alternate register select (7-4) 
4        SRD1L     DAG1 alternate register select (3-0) 
5        SRD2H     DAG2 alternate register select (15-12) 
6        SRD2L     DAG2 alternate register select (11-8) 
7        SRRFH     Register file alternate select for R(15-8) 
8-9                Reserved 
10       SRRFL     Register file alternate select for R(7-0) 
11       NESTM     Interrupt nesting enable 
12       IRPTEN    Global interrupt enable 
13       ALUSAT    Enable ALU saturation (full scale in fixed-point) 
14                 Reserved 
15       TRUNC     1=Floating-point truncation; 0=Round to nearest 
16       RND32     1=Round floating-point data to 32 bits; 0=Round to 40 bits 
                   (must be set to 1 for ADSP-21010) 
17-31              Reserved 
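
Since the bit names are conventions rather than part of the instruction set, a program or tool has to turn them into a numeric mask itself. A minimal sketch, with a hypothetical helper and a name-to-bit mapping copied from the table above:

```python
# Bit positions of the named MODE1 control bits (reserved bits omitted).
MODE1_BITS = {
    "BR0": 1, "SRCU": 2, "SRD1H": 3, "SRD1L": 4, "SRD2H": 5, "SRD2L": 6,
    "SRRFH": 7, "SRRFL": 10, "NESTM": 11, "IRPTEN": 12, "ALUSAT": 13,
    "TRUNC": 15, "RND32": 16,
}


def mode1_mask(*names):
    """Build the data value that selects the named MODE1 bits."""
    mask = 0
    for name in names:
        mask |= 1 << MODE1_BITS[name]   # KeyError for an unknown name
    return mask
```

For example, `mode1_mask("IRPTEN", "RND32")` gives 0x11000, the value that would select the global interrupt enable and 32-bit rounding bits in a bit manipulation instruction.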


MODE1 register diagram (bits 31-0; bits 0, 8-9, 14 and 17-31 are reserved and shown as 0): 

Bit 16   RND32    0=Round Floating-Point to 40 Bits 
                  1=Round Floating-Point to 32 Bits 
                  (must be set to 1 for ADSP-21010) 
Bit 15   TRUNC    0=Floating-Point Round-to-Nearest 
                  1=Floating-Point Truncation 
Bit 13   ALUSAT   0=Disable ALU Saturation 
                  1=Enable ALU Saturation 
Bit 12   IRPTEN   0=Disable Interrupts 
                  1=Enable Interrupts 
Bit 11   NESTM    0=Disable Interrupt Nesting 
                  1=Enable Interrupt Nesting 
Bit 10   SRRFL    0=Enable R7-R0 Primary 
                  1=Enable R7-R0 Alternate 
Bit 7    SRRFH    0=Enable R15-R8 Primary 
                  1=Enable R15-R8 Alternate 
Bit 6    SRD2L    0=Enable DAG2 11-8 Primary 
                  1=Enable DAG2 11-8 Alternate 
Bit 5    SRD2H    0=Enable DAG2 15-12 Primary 
                  1=Enable DAG2 15-12 Alternate 
Bit 4    SRD1L    0=Enable DAG1 3-0 Primary 
                  1=Enable DAG1 3-0 Alternate 
Bit 3    SRD1H    0=Enable DAG1 7-4 Primary 
                  1=Enable DAG1 7-4 Alternate 
Bit 2    SRCU     0=Enable MR Primary 
                  1=Enable MR Alternate 
Bit 1    BR0      0=Disable I0 Bit-Reverse Mode 
                  1=Enable I0 Bit-Reverse Mode 


E.4 MODE2 REGISTER 

Bit      Name     Definition 

0        IRQ0E    IRQ0 1=edge-sensitive; 0=level-sensitive 
1        IRQ1E    IRQ1 1=edge-sensitive; 0=level-sensitive 
2        IRQ2E    IRQ2 1=edge-sensitive; 0=level-sensitive 
3        IRQ3E    IRQ3 1=edge-sensitive; 0=level-sensitive 
4        CADIS    Cache disable 
5        TIMEN    Timer enable 
6-14              Reserved 
15       FLG0O    FLAG0 1=output; 0=input 
16       FLG1O    FLAG1 1=output; 0=input 
17       FLG2O    FLAG2 1=output; 0=input 
18       FLG3O    FLAG3 1=output; 0=input 
19       CAFRZ    Cache freeze 
20-27             Reserved 
28-31             Device Identification Field (silicon revision #) 


MODE2 register diagram (bits 31-0; reserved bits shown as 0): 

Bits 31-28        Device Identification Field (silicon revision #) 
Bit 19   CAFRZ    0=Cache Updates 
                  1=Cache Freeze (No Updates) 
Bit 18   FLG3O    0=FLAG3 Input 
                  1=FLAG3 Output 
Bit 17   FLG2O    0=FLAG2 Input 
                  1=FLAG2 Output 
Bit 16   FLG1O    0=FLAG1 Input 
                  1=FLAG1 Output 


E.5 ARITHMETIC STATUS REGISTER (ASTAT) 


Bit      Name    Definition 

0        AZ      ALU result zero or floating-point underflow 
1        AV      ALU overflow 
2        AN      ALU result negative 
3        AC      ALU fixed-point carry 
4        AS      ALU X input sign (ABS and MANT operations) 
5        AI      ALU floating-point invalid operation 
6        MN      Multiplier result negative 
7        MV      Multiplier overflow 
8        MU      Multiplier floating-point underflow 
9        MI      Multiplier floating-point invalid operation 
10       AF      ALU floating-point operation 
11       SV      Shifter overflow 
12       SZ      Shifter result zero 
13       SS      Shifter input sign 
14-17            Reserved 
18       BTF     Bit test flag for system registers 
19       FLG0    FLAG0 value 
20       FLG1    FLAG1 value 
21       FLG2    FLAG2 value 
22       FLG3    FLAG3 value 
23               Reserved 
24-31            CACC (Compare Accumulation) bits 
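
A value read from ASTAT can be unpacked into named flags using the bit assignments above. The sketch below is illustrative only; the function name is hypothetical and the flag names are the conventional ones from the table.

```python
# Names of ASTAT bits 0-13, in bit order.
ASTAT_LOW_FLAGS = ["AZ", "AV", "AN", "AC", "AS", "AI", "MN",
                   "MV", "MU", "MI", "AF", "SV", "SZ", "SS"]


def decode_astat(value):
    """Return a dict of ASTAT flags plus the 8-bit CACC field."""
    flags = {name: bool((value >> bit) & 1)
             for bit, name in enumerate(ASTAT_LOW_FLAGS)}
    flags["BTF"] = bool((value >> 18) & 1)
    for n in range(4):
        flags[f"FLG{n}"] = bool((value >> (19 + n)) & 1)   # FLAG0-3 pin values
    flags["CACC"] = (value >> 24) & 0xFF                   # compare accumulation bits
    return flags
```

Reserved bits 14-17 and 23 are simply ignored by this decoder.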


ASTAT register diagram (bits 31-0; all bits shown as 0 except bits 22-19, which are shown as x): 

Bits 31-24   CACC   Compare Accumulation Shift Register 
Bit 22       FLG3   FLAG3 Value 
Bit 21       FLG2   FLAG2 Value 
Bit 20       FLG1   FLAG1 Value 
Bit 19       FLG0   FLAG0 Value 
Bit 18       BTF    Bit Test Flag for System Registers 
Bit 13       SS     Shifter Input Sign 
Bit 12       SZ     Shifter Zero 
Bit 11       SV     Shifter Overflow 
Bit 10       AF     ALU Floating-Point Operation 
Bit 9        MI     Multiplier Floating-Point Invalid Operation 
Bit 8        MU     Multiplier Floating-Point Underflow 
Bit 7        MV     Multiplier Overflow 
Bit 6        MN     Multiplier Negative 
Bit 5        AI     ALU Floating-Point Invalid Operation 
Bit 4        AS     ALU X Input Sign (for ABS and MANT) 
Bit 3        AC     ALU Fixed-Point Carry 
Bit 2        AN     ALU Negative 
Bit 1        AV     ALU Overflow 
Bit 0        AZ     ALU Zero/Floating-Point Underflow 
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E Controi/Status Registers 


E.6 STICKY ARITHMETIC STATUS REGISTER (STKY)

Bit     Name    Definition

0       AUS     ALU floating-point underflow
1       AVS     ALU floating-point overflow
2       AOS     ALU fixed-point overflow
3-4             Reserved
5       AIS     ALU floating-point invalid operation
6       MOS     Multiplier fixed-point overflow
7       MVS     Multiplier floating-point overflow
8       MUS     Multiplier floating-point underflow
9       MIS     Multiplier floating-point invalid operation
10-16           Reserved
17      CB7S    DAG1 circular buffer 7 overflow*
18      CB15S   DAG2 circular buffer 15 overflow*
19-20           Reserved
21      PCFL    PC stack full (not sticky)
22      PCEM    PC stack empty (not sticky)
23      SSOV    Status stack overflow (MODE1 and ASTAT)
24      SSEM    Status stack empty (not sticky)
25      LSOV    Loop stack overflow (Loop Address and Loop Counter)
26      LSEM    Loop stack empty (not sticky)
27-31           Reserved


(Bits 21-26 are read-only. Writes to the STKY register have no effect on these bits.)

* Bit 17 (DAG1 circular buffer 7 overflow) and Bit 18 (DAG2 circular buffer 15
overflow) indicate the occurrence of a circular buffer overflow. Rather than
remaining set until explicitly cleared, however, these bits are cleared by the
next memory access that uses the corresponding I register (I7, I15).
Circular buffer interrupts, therefore, should be used instead of these STKY
register bits. See Section 4.3.2.3, "Circular Buffer Overflow Interrupts," in
Chapter 4.
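
The read-only rule for bits 21-26 can be expressed as a mask. The sketch below models only the statement that writes have no effect on those bits; it is not a model of the hardware, the function names are hypothetical, and the behavior of reserved bits on a write is an assumption (they are simply stored).

```python
# Bit positions copied from the STKY table; names are illustrative.
STKY_BITS = {
    "AUS": 0, "AVS": 1, "AOS": 2, "AIS": 5,
    "MOS": 6, "MVS": 7, "MUS": 8, "MIS": 9,
    "CB7S": 17, "CB15S": 18,
    "PCFL": 21, "PCEM": 22, "SSOV": 23, "SSEM": 24, "LSOV": 25, "LSEM": 26,
}

# Bits 21-26 are read-only according to the note under the table.
READ_ONLY_MASK = sum(1 << bit for bit in range(21, 27))


def decode_stky(value):
    """Return the names of the STKY bits set in value."""
    return [name for name, bit in STKY_BITS.items() if (value >> bit) & 1]


def write_stky(current, written):
    """Value of STKY after software writes 'written' (stack flags untouched)."""
    kept = current & READ_ONLY_MASK      # stack flags ignore the write
    stored = written & ~READ_ONLY_MASK   # every other bit takes the new value
    return (kept | stored) & 0xFFFFFFFF
```

Writing zero to a register holding 0x05400001 clears AUS but leaves the three stack-empty flags set, so `write_stky(0x05400001, 0)` is 0x05400000.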
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STKY 




[Figure: STKY register bit diagram. Bits 26-21 (LSEM, LSOV, SSEM, SSOV, PCEM, PCFL) are marked read-only. Bits 18-17: CB15S, CB7S. Bits 9-0: MIS, MUS, MVS, MOS, AIS, AOS, AVS, AUS, as listed in the E.6 bit table.]
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E.7 INTERRUPT LATCH (IRPTL) & INTERRUPT MASK (IMASK) 


Bit
(Int#)  Address     Name    Function

0       0x00                Reserved for emulation
1       0x08        RSTI    RESET
2       0x10                Reserved
3       0x18        SOVFI   Status stack or loop stack overflow or PC stack full
4       0x20        TMZHI   Timer=0 (high priority option)
5       0x28        IRQ3I   IRQ3 asserted
6       0x30        IRQ2I   IRQ2 asserted
7       0x38        IRQ1I   IRQ1 asserted
8       0x40        IRQ0I   IRQ0 asserted
9       0x48                Reserved
10      0x50                Reserved
11      0x58        CB7I    Circular buffer 7 overflow interrupt
12      0x60        CB15I   Circular buffer 15 overflow interrupt
13      0x68                Reserved
14      0x70        TMZLI   Timer=0 (low priority option)
15      0x78        FIXI    Fixed-point overflow
16      0x80        FLTOI   Floating-point overflow exception
17      0x88        FLTUI   Floating-point underflow exception
18      0x90        FLTII   Floating-point invalid exception
19-23   0x98-0xB8           Reserved
24      0xC0        SFT0I   User software interrupt 0
25      0xC8        SFT1I   User software interrupt 1
26      0xD0        SFT2I   User software interrupt 2
27      0xD8        SFT3I   User software interrupt 3
28      0xE0        SFT4I   User software interrupt 4
29      0xE8        SFT5I   User software interrupt 5
30      0xF0        SFT6I   User software interrupt 6
31      0xF8        SFT7I   User software interrupt 7

For IMASK:

1=unmasked (enabled), 0=masked (disabled)

(interrupts 0 and 1 are not maskable)
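
In the table each vector address is eight times the bit number. The sketch below uses that relationship and the IMASK rule to list which latched interrupts are unmasked. The function names are hypothetical, and any global interrupt enable outside these two registers is deliberately ignored.

```python
NON_MASKABLE = 0b11  # interrupts 0 and 1 are not maskable


def vector_address(bit):
    """Vector address for an IRPTL/IMASK bit number (0-31)."""
    if not 0 <= bit <= 31:
        raise ValueError("interrupt bit must be in 0..31")
    return bit * 8  # table spacing: 0x00, 0x08, 0x10, ...


def enabled_pending(irptl, imask):
    """List (bit, vector address) for latched interrupts that are unmasked."""
    active = irptl & (imask | NON_MASKABLE)
    return [(bit, vector_address(bit)) for bit in range(32) if (active >> bit) & 1]
```

With IRQ0I and TMZLI both latched but only IRQ0I unmasked, `enabled_pending((1 << 8) | (1 << 14), 1 << 8)` returns the single entry for bit 8 at address 0x40.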




IRPTL & IMASK 






Default values for IMASK only; IRPTL is cleared after reset. 
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E.8 PROGRAM MEMORY INTERFACE CONTROL (PMWAIT) 

Bit     Definition

13      1=Automatic wait state for access across page boundary
        0=No automatic wait state

12-10   Memory page size:

        000     256 words
        001     512 words
        010     1024 words
        011     2048 words
        100     4096 words
        101     8192 words
        110     16384 words
        111     32768 words

9-7     Number of program memory bank 1 wait states (0-7)
6-5     Wait state mode* for program memory bank 1
4-2     Number of program memory bank 0 wait states (0-7)
1-0     Wait state mode* for program memory bank 0

* Wait state mode bits:

        0 0     External acknowledge only
        0 1     Internal software wait states only
        1 0     Both Internal and External acknowledge
        1 1     Either Internal or External acknowledge
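
A PMWAIT value can be assembled field by field from the table. The following Python sketch is one way to do that; the function names and the mode keywords ("external", "internal", "both", "either") are hypothetical labels for the four wait state mode codes.

```python
# Wait state mode codes from the table; the keyword names are illustrative.
WAIT_MODES = {"external": 0b00, "internal": 0b01, "both": 0b10, "either": 0b11}


def page_size_code(words):
    """Map a page size of 256..32768 words (power of two) to its 3-bit code."""
    code = words.bit_length() - 9  # 256 -> 0, 512 -> 1, ... 32768 -> 7
    if words <= 0 or words & (words - 1) or not 0 <= code <= 7:
        raise ValueError("page size must be a power of two from 256 to 32768")
    return code


def encode_pmwait(auto_wait, page_words, bank1_waits, bank1_mode,
                  bank0_waits, bank0_mode):
    """Pack the PMWAIT fields into bits 13-0."""
    for waits in (bank0_waits, bank1_waits):
        if not 0 <= waits <= 7:
            raise ValueError("wait states must be in 0..7")
    return (int(bool(auto_wait)) << 13
            | page_size_code(page_words) << 10
            | bank1_waits << 7
            | WAIT_MODES[bank1_mode] << 5
            | bank0_waits << 2
            | WAIT_MODES[bank0_mode])
```

For instance, an automatic boundary wait state, 1024-word pages, bank 1 with three internal wait states, and bank 0 with two wait states in "both" mode pack to 0x29AA.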


[Figure: PMWAIT register bit diagram. Fields, from bit 13 down: automatic wait state on boundary crossing; program memory DRAM page size; bank 1 number of wait states; bank 1 wait state mode; bank 0 number of wait states; bank 0 wait state mode. The DRAM memory page size codes (000=256 words through 111=32768 words) and wait state mode codes (00=external acknowledge only, 01=internal wait states only, 10=both external and internal required, 11=either external or internal sufficient) repeat the bit table.]
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E.9 DATA MEMORY INTERFACE CONTROL (DMWAIT) 

Bit     Definition

23      1=Automatic wait state for access across page boundary
        0=No automatic wait state

22-20   Memory page size:

        000     256 words
        001     512 words
        010     1024 words
        011     2048 words
        100     4096 words
        101     8192 words
        110     16384 words
        111     32768 words

19-17   Number of data memory bank 3 wait states (0-7)
16-15   Wait state mode* for data memory bank 3
14-12   Number of data memory bank 2 wait states (0-7)
11-10   Wait state mode* for data memory bank 2
9-7     Number of data memory bank 1 wait states (0-7)
6-5     Wait state mode* for data memory bank 1
4-2     Number of data memory bank 0 wait states (0-7)
1-0     Wait state mode* for data memory bank 0


* Wait state mode bits:

        0 0     External acknowledge only
        0 1     Internal software wait states only
        1 0     Both Internal and External acknowledge
        1 1     Either Internal or External acknowledge
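
Because the four bank fields in DMWAIT repeat every five bits, the register can be unpacked with a short loop. This is a sketch derived from the bit table only; the function name and the shape of the returned dictionary are hypothetical.

```python
def decode_dmwait(value):
    """Split a DMWAIT value into its fields.

    'banks' is indexed by bank number; each entry is
    (number of wait states, 2-bit wait state mode code).
    """
    banks = []
    for bank in range(4):
        shift = 5 * bank                      # bank n starts at bit 5n
        mode = (value >> shift) & 0b11        # mode: bits 5n+1 .. 5n
        waits = (value >> (shift + 2)) & 0b111  # waits: bits 5n+4 .. 5n+2
        banks.append((waits, mode))
    return {
        "auto_wait": bool((value >> 23) & 1),
        "page_words": 256 << ((value >> 20) & 0b111),
        "banks": banks,
    }
```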


[Figure: DMWAIT register bit diagram. Fields, from bit 23 down: automatic wait state on boundary crossing; data memory DRAM page size; then, for banks 3, 2, 1 and 0 in turn, number of wait states and wait state mode. The DRAM memory page size codes (000=256 words through 111=32768 words) and wait state mode codes (00=external acknowledge only, 01=internal wait states only, 10=both external and internal required, 11=either external or internal sufficient) repeat the bit table.]
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ADS P-2 1 02 0/2 1010 User's Manual 2nd Edition (1993) 
Revisions from 1st Edition (1991) 


Key

Text deleted from 1st Edition is crossed-out: "only by EXP operations"
New text added in 2nd Edition is underlined: "SS is cleared"


page (2nd Ed.) revision 

p. 1-7, para. 5     "Fixed-point and single-precision floating-point
                    data is aligned to the upper 32 bits of the PMD
                    bus."


p. 1-7, para. 6     "Fixed-point and single-precision floating-point
                    data is aligned to the upper 32 bits of the DMD
                    bus."


p. 2-1, last para. " The individual registers of the register file are 
prefixed with an "f" when used in floating-point 
computations (in assembly language source code). 
The registers are prefixed with an "r" when used 
in fixed-point computations. The following 
instructions, for example, use the same three 
registers : 

F0=F1 * F2; floating-point multiply
R0=R1 * R2; fixed-point multiply

The "f" and "r" prefixes do not affect the 40-bit 
data transfer; they only determine how the ALU, 
multiplier, or shifter treat the data ." 


p. 2-10             ALU Instruction Summary table now shows
                    effects of ALU instructions on all status bits of
                    ASTAT and STKY status registers.

p. 2-17, para. 6    Twos-complement Fractional: "upper [deleted: 33] 17 bits of
                    MR not all zeros or all ones"

p. 2-17, para. 7    Unsigned Fractional: "upper [deleted: 32] 16 bits of MR not
                    all zeros"


p. 2-18             Multiplier Instruction Summary table now shows
                    effects of multiplier instructions on status bits of
                    ASTAT and STKY status registers.



p. 2-20, para. 1    "The X-input and the Z-input are always 32-bit
                    fixed-point values. The Y-input is a 32-bit fixed-
                    point value or [deleted: two 6-bit fields, bit6 and len6] an
                    8-bit field (shf8), positioned in the register file as
                    shown in Figure 2.4. [deleted: Bit6 and len6 are interpreted
                    as positive integers: Bit6 holds a bit position value;
                    Len6 holds a field length value.]

                    Some shifter operations produce 6-bit or 8-bit
                    results. These results are placed in either the bit6
                    or shf8 field [deleted: shown in Figure 2.4,] (see Figure 2.5)
                    and are sign-extended to 32 bits. Thus the shifter
                    always returns a 32-bit result."

p. 2-20, Figure 2.4 12-bit Y-Input (len6, bit6) representation is deleted
                    from Figure 2.4 and is shown in (new) Figure 2.5
                    instead.

p. 2-20             New section added: "2.7.2 Bit Field Deposit &
                    Extract Instructions"

p. 2-24, para. 1    "The SZ flag indicates if the output is zero, the SV
                    flag indicates an overflow, and the SS flag
                    indicates the sign bit in exponent extract
                    operations."

p. 2-24, last para. "SS is affected [deleted: only by the two EXP operations] by
                    all shifter operations. For the two EXP (exponent
                    extract) operations, it is set if the fixed-point input
                    operand is negative and cleared if it is positive.

                    For all other shifter operations, SS is cleared."

p. 2-25             In Shifter Instruction Summary table, SS flag is
                    now shown as being cleared by all instructions
                    except EXP Rx and EXP Rx (EX).
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p. 2-27, last para. 


" The individual registers of the register file are 
prefixed with an "f" when used in floating-point 
computations (in assembly language source code). 
The registers are prefixed with an "r" when used 
in fixed-point computations. The following 
instructions, for example, use the same three 
registers : 

F0=F1 * F2; floating-point multiply 
R0=R1 * R2 ; fixed-point multiply 

The "f" and "r" prefixes do not affect the 40-bit (or
32-bit) data transfer; they only determine how the
ALU, multiplier, or shifter treat the data."

p. 3-5, para. 2 "The system register bit manipulation instruction can
be used to set, clear, toggle or test specific bits in
these registers. This instruction is described in
Appendix A, Group IV-Miscellaneous
instructions."

p. 3-5 Table 3.1 now shows which registers are defined 

as System Registers. 

p. 3-7, para. 3     "The bit test flag (BTF) is bit 18 of the ASTAT
                    register. [deleted: The state of BTF is one of the conditions
                    the ADSP-21020 evaluates. This read-only flag is
                    affected by the system register bit test and XOR
                    instructions. These operations are described at the
                    end of this chapter.] This flag is set (or cleared) by
                    the results of the BIT TST and BIT XOR forms of
                    the system register bit manipulation instruction,
                    which can be used to test the contents of the
                    ADSP-21020's system registers. This instruction is
                    described in Appendix A, Group IV-Miscellaneous
                    instructions. After BTF is set by this instruction, it
                    can be used as the condition in a conditional
                    instruction (with the mnemonic TF; see Table
                    3.2)."
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p. 3-8, Table 3.2 
                    LCE      Loop Cntr Expired (loop term)   CURLCNTR = [deleted: 0] 1
                    NOT LCE  Loop Cntr Expired (condition)   CURLCNTR ≠ [deleted: 0] 1

p. 3-9, last para.  "This is similar to the break instruction of the C
                    programming language used to prematurely
                    terminate execution of a loop."

p. 3-13, para. 3    "Here is a simple example of an ADSP-21020 loop:

                            LCNTR=30, DO label UNTIL LCE;
                                R0=DM(I0,M0), F2=PM(I8,M8);
                                R1=R0-R15;
                    label:      F4=F2+F3;"

p. 3-13, last para. "If the termination condition is true, the sequencer
                    fetches the next instruction after the end of the
                    loop and pops the loop stack and PC stack."

p. 3-15, Figure 3.7 The One-Instruction Loop, Three Iterations table is
                    modified in third, fourth, and fifth clock cycle
                    columns.

p. 3-17, para. 1    "[deleted: For nested loops in which the outer loop's
                    termination condition is not LCE, the end address
                    of the outer loop must be at least two locations
                    after the end address of the inner loop.]

                    A non-counter-based loop is one in which the loop
                    termination condition is something other than
                    LCE. When a non-counter-based loop is the outer
                    loop of a series of nested loops, the end address of
                    the outer loop must be located at least two
                    addresses after the end address of the inner loop.

                    The JUMP (LA) instruction is used to prematurely
                    abort execution of a loop. When this instruction is
                    located in the inner loop of a series of nested loops
                    and the outer loop is non-counter-based, the
                    address jumped to cannot be the last instruction of
                    the outer loop. The address jumped to may,
                    however, be the next-to-last instruction (or any
                    earlier)."

p. 3-17, para. 2    "Non-counter-based short loops terminate in a
                    special way because of the fetch-decode-execute
                    instruction pipeline."

p. 3-23             "• waitstates for memory accesses
                     • bus grant"

p. 3-23, last para. "[deleted: IRPTL is in an indeterminate state at reset. You
                    should clear IRPTL by writing zeros to it before
                    enabling interrupts or unmasking any interrupt.]

                    IRPTL is cleared by a processor reset."

p. 3-27             "[deleted: If an RTI is specified as delayed, the two
                    instructions following the RTI are executed after
                    the status stack has been popped but before
                    control returns to the main program. Any status
                    bits read or written by those two instructions
                    reflect the context of the main program, not the
                    service routine.]"

p. 3-28, para. 5    "The STKY register maintains stack [deleted: overflow/full
                    and underflow flags for the PC stack, the status
                    stack and the loop stacks] full and stack empty
                    flags for the PC stack as well as overflow and
                    empty flags for the status stack and loop stack.
                    Unlike other STKY bits, [deleted: the stack overflow/full
                    and underflow flags] several of these flag bits are
                    not "sticky." They are set by the occurrence of the
                    condition they indicate and are cleared when the
                    condition is changed (by a push, pop or processor
                    reset).

                    Bit   Name   Definition              Sticky/Not Sticky   Cleared By

                    21    PCFL   PC stack full           Not sticky          Pop
                    22    PCEM   PC stack empty          Not sticky          Push
                    23    SSOV   Status stack overflow   Sticky              RESET
                    24    SSEM   Status stack empty      Not sticky          Push
                    25    LSOV   Loop stacks overflow*   Sticky              RESET
                    26    LSEM   Loop stacks empty*      Not sticky          Push

                    * Loop address stack and loop counter stack"

p. 3-30, para. 1 " On return from the interrupt, execution continues 
at the instruction after the IDLE instruction ." 


p. 4-4, para. 6     "The L register and modulo logic do not affect a
                    pre-modified address; pre-modify addressing is
                    always linear, not circular."

p. 4-6, para. 3     "Circular buffer addressing must use M registers
                    for post-modify of I registers, not pre-modify; for
                    example:

                    F1=DM(I0,M0);    Use post-modify addressing for
                                     circular buffers,
                    F1=DM(M0,I0);    not pre-modify."

p. 4-8              New section added: "4.3.2.3 Circular Buffer
                    Overflow Interrupts"
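
The difference between the two addressing forms in the p. 4-4 and p. 4-6 entries can be sketched as follows. This is a simplified Python model, not the processor's modulo logic: the explicit `base` argument, the restriction to modify values smaller than the buffer length, and both function names are assumptions made for illustration.

```python
def post_modify(i, m, base, length):
    """Post-modify access: use address I, then update I by M with wraparound.

    Returns (address used, new I value). A length of 0 disables wrapping.
    Assumes abs(m) < length when length is nonzero.
    """
    address = i
    new_i = i + m
    if length:
        if new_i >= base + length:
            new_i -= length   # ran past the end of the buffer: wrap to the start
        elif new_i < base:
            new_i += length   # ran before the start: wrap to the end
    return address, new_i


def pre_modify(i, m):
    """Pre-modify access: address is I+M, always linear; I is not updated."""
    return i + m
```

With an 8-word buffer starting at 0x100, a post-modify access at 0x107 with a modifier of 2 leaves I at 0x101, while the pre-modify form simply addresses 0x109, outside the buffer.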


p. 4-11, para. 2 " For certain instruction sequences involving 
transfers to and from DAG registers, an extra 
(NOP) cycle is either automatically inserted by the 
processor (1, 2) or must be inserted in code by the 
programmer (3). Certain other sequences cause 
incorrect results and are not allowed by the ADSP- 
21020 Assembler (4). " 
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p. 4-12, para. 1 


" (Note that because the DAG2 registers are used 
to fetch instructions or access data in every cycle, a 
write to a program memory control register will 
always require an extra cycle to be inserted.) 

Each of the following instruction sequences, for
example,

[deleted: 1st Edition example, illegible in this copy]

PMWAIT=0x080000;    or    DMBANK1=0x10000000;
NOP;                      R15=DM(I0,M1);"

p. 4-12, para. 3    "3.) An instruction that writes any L or M register
                    of DAG2 (L8-L15, M8-M15), immediately followed
                    by an instruction that reads the corresponding I
                    register will result in incorrect data being read
                    from the I register. The following instruction
                    sequence, for example,

                    L8=24;
                    R0=I8;

                    will cause incorrect data to be read from I8. To
                    prevent this, add a NOP to your program between
                    the two instructions (i.e. the L or M register write
                    and the I register read):

                    L8=24;
                    NOP;
                    R0=I8;"

p. 6-2              "PMD47-0    Program Memory Data. The ADSP-
                    21020 inputs and outputs data and instructions on
                    these pins. 32-bit fixed-point data and 32-bit
                    single-precision floating-point data is transferred
                    over bits 47-16 of the PMD bus."

                    "[deleted: PMS0    Program Memory Select 0. This pin is
                    asserted to select bank 0 of program memory.
                    Memory banks are user-defined in memory
                    control registers.

                    PMS1    Program Memory Select 1. This pin is
                    asserted to select bank 1 of program memory.
                    Memory banks are user-defined in memory
                    control registers.]"

                    "PMS1-0    Program Memory Select lines 1 & 0.
                    These pins are asserted as chip selects for the
                    corresponding banks of program memory.
                    Memory banks must be defined in the processor's
                    memory control registers. These pins are decoded
                    program memory address lines and provide an
                    early indication of a possible bus cycle."

                    "PMACK    Program Memory Acknowledge. [deleted: An
                    external device asserts this ADSP-21020 input to
                    terminate a memory access. This is one method of
                    creating wait states. This input is normally tied
                    high for zero wait-state operation.] An external
                    device deasserts this input to add wait states to a
                    memory access."

                    "PMTS    Program Memory Three-State Control.
                    [deleted: An active signal on this input places program
                    memory address, data and control signals in a
                    high-impedance state and locks out the PMACK
                    input, without halting the processor.]

                    PMTS places the program memory address, data,
                    selects, and strobes in a high-impedance state. If
                    PMTS is asserted while a PM access is in progress,
                    the processor will halt and the memory access will
                    not be completed. PMACK must be asserted for at
                    least one cycle when PMTS is deasserted to allow
                    any pending memory access to complete properly.
                    PMTS should only be asserted (low) during an
                    active memory access cycle."

                    "DMD39-0    Data Memory Data. The ADSP-21020
                    inputs and outputs data on these pins. 32-bit
                    fixed-point data and 32-bit single-precision
                    floating-point data is transferred over bits 39-8 of
                    the DMD bus."

                    "DMS3-0    Data Memory Select lines 0, 1, 2, & 3.
                    These pins are asserted as chip selects for the
                    corresponding banks of data memory. Memory
                    banks must be defined in the processor's memory
                    control registers. These pins are decoded data
                    memory address lines and provide an early
                    indication of a possible bus cycle."

                    "DMACK    Data Memory Acknowledge. An [...]"

                    "DMTS    Data Memory Three-State Control.
                    [deleted: An ...]

                    DMTS places the data memory address, data,
                    selects, and strobes in a high-impedance state. If
                    DMTS is asserted while a DM access is in progress,
                    the processor will halt and the memory
                    access will not be completed. DMACK must be
                    asserted for at least one cycle when DMTS is
                    deasserted to allow any pending memory access to
                    complete properly. DMTS should only be asserted
                    (low) during an active memory access cycle."

p. 6-4, para. 3     "1. The ADSP-21020 drives the read address and
                    asserts a memory select signal to indicate the
                    selected bank. A memory select signal is not
                    deasserted between successive accesses of the
                    same memory bank.

                    2. The ADSP-21020 asserts the read strobe (unless
                    the memory access is aborted because of a
                    conditional instruction)."

p. 6-5, para. 2     "1. The ADSP-21020 drives the write address and
                    asserts a memory select signal to indicate the
                    selected bank. A memory select signal is not
                    deasserted between successive accesses of the
                    same memory bank.

                    [deleted: 2. The ADSP-21020 asserts the write strobe.

                    3. The ADSP-21020 drives the data.]

                    2. The ADSP-21020 asserts the write strobe and
                    drives the data (unless the memory access is
                    aborted because of a conditional instruction)."

p. 6-7, para. 1     "... the acknowledge should be deasserted (low) in
                    the same cycle after that the three-state enable is
                    deasserted in."

p. 6-7, para. 3     "DMACK"

p. 6-7, last para.  "PMACK"

p. 6-14             Figure 6.4, "Bus Request/Bus Grant Timing," is
                    revised to show possible multicycle instruction
                    execution completion before buses are granted.
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p. 6-16, last para. 


                    "To write a 48-bit word to a program memory
                    location named Port1, for example, the following
                    instructions would be used:

                    R0=0x9A00;        /* load R0 with 16 LSBs */
                    R1=0x12345678;    /* load R1 with 32 MSBs */
                    PX1=R0;
                    PX2=R1;
                    PM(Port1)=PX;     /* write 16 LSBs to PM bits 15-0 */
                                      /* and 32 MSBs to PM bits 47-16 */"

p. 7-4, para. 2     "An example is:

                    L2=8;
                    DM(I0,M1)=R1;"

p. 7-4, para. 3     "(Note that because the DAG2 registers are used
                    to fetch instructions or access data in every cycle, a
                    write to a program memory control register will
                    always require an extra cycle to be inserted.)

                    Each of the following instruction sequences, for
                    example, causes the ADSP-21020 to insert an extra
                    cycle between the two instructions:

                    PMWAIT=0x080000;    or    DMBANK1=0x10000000;
                    NOP;                      R15=DM(I0,M1);

                    An instruction that writes any L or M register of
                    DAG2 (L8-L15, M8-M15), immediately followed
                    by an instruction that reads the corresponding I
                    register will result in incorrect data being read
                    from the I register. The following instruction
                    sequence, for example, will cause incorrect data to
                    be read from I8:

                    L8=24;
                    R0=I8;

                    To prevent this, add a NOP between the two
                    instructions:

                    L8=24;
                    NOP;
                    R0=I8;"
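
For the PX register example quoted for p. 6-16, the 48-bit word written to program memory is the 32-bit PX2 value placed in bits 47-16 and the 16-bit PX1 value placed in bits 15-0. The sketch below shows that packing in Python; the function names are hypothetical.

```python
def pack_px(px1, px2):
    """Combine PX1 (16 LSBs) and PX2 (32 MSBs) into one 48-bit word."""
    return ((px2 & 0xFFFFFFFF) << 16) | (px1 & 0xFFFF)


def unpack_px(word48):
    """Split a 48-bit word back into (PX1, PX2)."""
    return word48 & 0xFFFF, (word48 >> 16) & 0xFFFFFFFF
```

With the values used in the manual's example, `pack_px(0x9A00, 0x12345678)` yields 0x123456789A00.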

p. 7-14 Table 7-14 is simplified (but no changes made in 

instruction definitions). 

p. 7-19 Instruction type 10 deleted. 

p. 8-6              In Listing 8.1, several segment addresses are
                    corrected as follows:

                    [deleted:
                    .SEGMENT /RAM /BEGIN=0x000100   /END=0x007FFF     /PM pm_code;
                    .SEGMENT /RAM /BEGIN=[illegible] /END=0xFFFFFF     /PM pm_data;
                    .SEGMENT /RAM /BEGIN=0x00000000 /END=[illegible]  /DM dm_data;]

                    .SEGMENT /RAM /BEGIN=0x000100   /END=0x0007FF     /PM pm_code;
                    .SEGMENT /RAM /BEGIN=0x000800   /END=0x000FFF     /PM pm_data;
                    .SEGMENT /RAM /BEGIN=0x00000000 /END=0x00007FF    /DM dm_data;


p. 8-20             The next-to-last instruction of the initial
                    setups: portion of Listing 8.4 is deleted:

                    [deleted: bit set irptl 0;    {RESET doesn't clear this!}]

p. 8-27, para. 2    [deleted: bit set irptl 0;]
p. 8-27, para. 4    [deleted: bit set irptl 0;]
p. 8-28, para. 2    [deleted: bit set irptl 0;]
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p. 8-34 


                    In Listing 8.8, the [deleted text illegible]
                    portion of the second instruction is replaced by:

                    r8=r8 xor r8

p. 9-4, para. 2     "Therefore, to ensure recognition of an
                    asynchronous input, it must be asserted for at least
                    one full processor cycle plus setup and hold time
                    (except for RESET, which must be asserted for at
                    least four processor cycles)."

p. 9-4, para. 4     "Table 9.2 shows the states of outputs [deleted: after] during
                    reset (i.e. while RESET is low)."

p. 9-4              New section added: "9.4 RCOMP Pin"

p. 9-5, Table 9.1   The following reset values are corrected:

                    Register   1st Edition   2nd Edition

                    PC         Unchanged     0x0008
                    LCNTR      Unchanged     0x0000 (cleared)
                    IRPTL      Unchanged     0x0000 (cleared)
                    IMASK      0             0x0003
                    STKY       0             0x0540 0000
                    MODE2      0             0xn000 0000
                                             (bits 28-31 are the device
                                             identification field, identifying the
                                             silicon revision #)
                    ASTAT      0             0x00nn 0000
                                             (bits 19-22 are equal to the values of
                                             the FLAG0-3 input pins; the flag pins
                                             are configured as inputs after reset)

p. 9-6, para. 1     "During the first two memory accesses, which
                    have [deleted: eight] seven wait states each due to the
                    default value of the PMWAIT register, ..."

p. 9-6, Fig. 9.2    One additional CLKIN cycle is added between
                    rising edge of RESET and start of first instruction
                    fetch (0x000008 driven onto PMA bus).



p. 9-9, para. 2 "No pullup or pulldown resistors are needed on 
these unused pins — this is taken care of on-chip ." 


p. 9-9, Fig. 9.4    Data bus lines corrected:    [deleted: DMD0-31] DMD39-8
                    Faster ADSP-21020 device:    [deleted: illegible] 30 ns
                    Memories relabeled:          [deleted: 7C196-35] SRAM 64Kx4 15 ns

p. 9-10, Fig. 9.5   Faster ADSP-21020 device:    [deleted: illegible] 30 ns
                    Memories relabeled:          [deleted: 7C199-35] SRAM 32Kx8 15 ns

p. 9-11, para. 2    "I/O devices should be connected to the 32-bit
                    integer field (the upper 32 bits) of the DMD or PMD
                    data buses: bits 39-8 of the DMD bus, and bits 47-
                    16 of the PMD bus."
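
The bus alignment described here amounts to a fixed left shift: 8 bits on the 40-bit DMD bus and 16 bits on the 48-bit PMD bus. The following Python sketch illustrates where a 32-bit value appears on each bus; the function names are hypothetical, and the low-order bus bits are simply shown as zero.

```python
def to_dmd(word32):
    """Place a 32-bit word on bits 39-8 of the 40-bit DMD bus."""
    return (word32 & 0xFFFFFFFF) << 8


def to_pmd(word32):
    """Place a 32-bit word on bits 47-16 of the 48-bit PMD bus."""
    return (word32 & 0xFFFFFFFF) << 16


def from_dmd(bus40):
    """Recover the 32-bit field from bits 39-8 of a DMD bus value."""
    return (bus40 >> 8) & 0xFFFFFFFF
```

An I/O device wired to DMD39-8 therefore sees `0x12345678` when the bus carries `to_dmd(0x12345678)`, that is, 0x1234567800.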


p. 9-11, Fig. 9.6   Faster ADSP-21020 device:    [deleted: illegible] 30 ns
                    Memories relabeled:          [deleted: 7C199-35] SRAM 32Kx8 15 ns

p. 9-13, Fig. 9.8   Data bus lines corrected:    [deleted: DMD31-0] DMD39-8

p. 9-15, Fig. 9.10  Data bus lines corrected:    [deleted: DMD31-0] DMD39-8

p. 9-16, Fig. 9.11  Data bus lines corrected:    [deleted: DMD31-0] DMD39-8

p. 9-19, Fig. 9.13  Data bus lines corrected:    [deleted: DMD0-39] DMD39-8

p. 9-18 (1st Ed)    Section 9.5.2.2 "Shared Single-Port Memory"
                    deleted.
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p. 9-30 Section 9.8 "EZ-ICE EMULATOR 

CONSIDERATIONS" is revised. In Figure 9.20 
"Target Board Connector For EZ-ICE Probe" the 
signals BTRST and TRST are now shown as active 
low: BTRST TRST 

p. A-8              LCE      Loop Cntr Expired (loop term)   CURLCNTR = [deleted: 0] 1

                    NOT LCE  Loop Cntr Expired (condition)   CURLCNTR ≠ [deleted: 0] 1

p. A-23 Instruction type 10 deleted. 

p. A-28,29 (1st Ed) Instruction type 10 deleted. 

p. A-30 "The end address can be either a label for an 

absolute 24-bit program memory address, or a PC- 
relative, 24-bit twos-complement address." 

p. A-32, Do Until "The end address can be either a label for an 

absolute 24-bit program memory address, or a PC- 
relative, 24-bit twos-complement address." 

"Examples: 

DO [deleted: illegible] end UNTIL FLAG1_IN;

{end is a program label}


DO (PC, [deleted: end] 7) UNTIL AC;
[deleted: {end is user-defined label}]"


p. A-46, Idle " On return from the interrupt, execution continues 

at the instruction following the IDLE instruction ." 

p. B-2, para.l " The CU (computation unit) field is defined as 

follows : 

CU=00 ALU operations

CU=01 Multiplier operations

CU=10 Shifter operations"
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p. B-39, para.2 


                    "The following code performs floating-point
                    division using an iterative convergence
                    algorithm.* The result is accurate to one LSB in
                    whichever format mode, 32-bit or 40-bit, is set (32-
                    bit only for ADSP-21010). [deleted: This code executes in 8
                    cycles.] It requires these inputs: F0=numerator,
                    F12=denominator, F11=2.0. It returns the quotient
                    in F0. (The two highlighted instructions can be
                    removed if only a ±1 LSB accurate single-precision
                    result is necessary.)

                    F0=RECIPS F12, F7=F0;               {Get 8 bit seed R0=1/D}
                    F12=F0*F12;                         {D'=D*R0}
                    F7=F0*F7, F0=F11-F12;               {F0=R1=2-D', F7=N*R0}
                    F12=F0*F12;                         {F12=D'=D'*R1}
                    F7=F0*F7, F0=F11-F12;               {F7=N*R0*R1, F0=R2=2-D'}
                    [deleted: RTS (DB),] F12=F0*F12;    {F12=D'=D'*R2}
                    F7=F0*F7, F0=F11-F12;               {F7=N*R0*R1*R2, F0=R3=2-D'}
                    F0=F0*F7;                           {F0=N*R0*R1*R2*R3}

                    Note that this code segment can be made into a
                    subroutine by adding an rts (db) clause to the
                    third-to-last instruction."


p. B-40, para.2     "The following code calculates a floating-point
                    reciprocal square root (1/√x) using a Newton-
                    Raphson iteration algorithm.* The result is
                    accurate to one LSB in whichever format mode,
                    32-bit or 40-bit, is set (32-bit only for ADSP-21010).
                    To calculate the square root, simply multiply the
                    result by the original input. [deleted: This code executes in
                    13 cycles.] It requires these inputs: F0=input,
                    F8=3.0, F1=0.5. It returns the result in F4. (The four
                    highlighted instructions can be removed if only a
                    ±1 LSB accurate single-precision result is
                    necessary.)

                    F4=RSQRTS F0;                       {Fetch 4-bit seed}
                    F12=F4*F4;                          {F12=X0^2}
                    F12=F12*F0;                         {F12=C*X0^2}
                    F4=F1*F4, F12=F8-F12;               {F4=.5*X0, F12=3-C*X0^2}
                    F4=F4*F12;                          {F4=X1=.5*X0(3-C*X0^2)}
                    F12=F4*F4;                          {F12=X1^2}
                    F12=F12*F0;                         {F12=C*X1^2}
                    F4=F1*F4, F12=F8-F12;               {F4=.5*X1, F12=3-C*X1^2}
                    F4=F4*F12;                          {F4=X2=.5*X1(3-C*X1^2)}
                    F12=F4*F4;                          {F12=X2^2}
                    [deleted: RTS (DB),] F12=F12*F0;    {F12=C*X2^2}
                    F4=F1*F4, F12=F8-F12;               {F4=.5*X2, F12=3-C*X2^2}
                    F4=F4*F12;                          {F4=X3=.5*X2(3-C*X2^2)}

                    Note that this code segment can be made into a
                    subroutine by adding an rts (db) clause to the
                    third-to-last instruction."
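
The arithmetic of the two routines above can be followed in a high-level sketch. The Python below mirrors the multiply and subtract steps of the division and reciprocal square root listings, but it is only an approximation of them: it runs in double precision rather than the 32-bit or 40-bit formats, and `recip_seed` and `rsqrt_seed` are hypothetical stand-ins that imitate the RECIPS and RSQRTS seeds by rounding the true value to roughly 8 and 4 significant bits.

```python
import math


def recip_seed(d):
    """Stand-in for RECIPS: 1/d rounded to about 8 significant bits."""
    mantissa, exponent = math.frexp(1.0 / d)
    return math.ldexp(round(mantissa * 256) / 256, exponent)


def rsqrt_seed(c):
    """Stand-in for RSQRTS: 1/sqrt(c) rounded to about 4 significant bits."""
    mantissa, exponent = math.frexp(1.0 / math.sqrt(c))
    return math.ldexp(round(mantissa * 16) / 16, exponent)


def divide(n, d):
    """Iterative convergence division following the listing's data flow."""
    r = recip_seed(d)   # F0 = R0
    q = n               # F7 = N
    dp = r * d          # F12 = D' = D*R0
    for _ in range(3):
        q, r = r * q, 2.0 - dp   # F7=F0*F7, F0=F11-F12 (issued in parallel)
        dp = r * dp              # F12=F0*F12 (result unused after last pass)
    return r * q        # F0 = N*R0*R1*R2*R3


def rsqrt(c):
    """Newton-Raphson reciprocal square root: x = 0.5*x*(3 - c*x*x)."""
    x = rsqrt_seed(c)   # F4 = X0
    for _ in range(3):
        t = 3.0 - c * (x * x)    # F12 = 3 - C*Xn^2
        x = (0.5 * x) * t        # F4 = X(n+1)
    return x
```

As the manual notes, the square root follows by multiplying the result by the original input, for example `2.0 * rsqrt(2.0)`.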


p. B-46 


In Table B.4, the following Mod2 options are
deleted:

[deleted: (SSIR), (SUIR), (USIR), (UUIR); their opcode encodings are illegible in this copy]


p. B-52 


MR Register Transfer instruction is moved to this 
page (from p. B-81 in 1st Edition). 


p. B-55 to B-69,    "SS    [deleted: Is not affected]
B-72, 73                  Is cleared"


p. B-64, 66, 68, 69 "The floating-point extension field of Rn (bits 7-0
                    of the 40-bit word) is set to all 0s."


p. B-64, 66, 68 New figures added for FDEP, FEXT instructions. 



p. B-71 


In the shifter operation Rn=EXP Rx (EX), the
definition of the SS status flag is changed from:


"SS Is set if the fixed-point operand in Rx is
negative (bit 31 is a 1), otherwise cleared"

to:

"SS Is set if the exclusive OR of the AV status
bit and the sign bit (bit 31) of the fixed-
point operand in Rx is equal to 1, otherwise
cleared"
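The revised SS definition can be modeled in a few lines. This is a sketch under stated assumptions: the status bit named in the text is taken to be the ALU overflow flag AV, Rx is treated as a 32-bit word, and the function name ss_flag is hypothetical.

```python
def ss_flag(rx, av):
    """Model of the revised SS flag for Rn=EXP Rx (EX).

    rx: 32-bit fixed-point operand, given as an unsigned integer.
    av: overflow status bit (assumed here to be the ALU AV flag), 0 or 1.
    """
    sign = (rx >> 31) & 1    # bit 31 of the operand
    return sign ^ (av & 1)   # SS = AV XOR sign


# A negative operand with no overflow sets SS; the same operand
# with AV set clears it, because overflow means the sign is wrong.
ss_negative = ss_flag(0x80000000, 0)
ss_overflowed = ss_flag(0x80000000, 1)
```

The XOR recovers the true sign of a result whose sign bit was corrupted by overflow, which the earlier definition (bit 31 alone) could not do.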


p. C-3, Table C.1


Instruction Bits    Register Name
1234                (Serial Path)             Type

0100                Reserved for emulation    Private
xx10                Reserved for emulation    Private


p. C-9, 10


Scan        Latch           Signal
Position    Type            Name

234         Output Input    FLAG3 Input Latch
235         Input Output    FLAG3 Output Latch
236         Output Input    FLAG2 Input Latch
237         Input Output    FLAG2 Output Latch
238         Output Input    FLAG1 Input Latch
239         Input Output    FLAG1 Output Latch
240         Output Input    FLAG0 Input Latch
241         Input Output    FLAG0 Output Latch



p. C-11


1 = Drive the associated signals during the EXTEST
and INTEST instructions 


0 = Tristate the associated signals during the 
EXTEST and INTEST instructions 
p. D-1, para. 1


p. D-2, para. 5


p. E-1, Table E.1
p. E-2, para. 1

p. E-2, para.5 
p. E-3 


"The ADSP-21020 and ADSP-21010 support the 32-
bit single-precision floating-point data format
defined in the IEEE Standard 754/854. In addition,
the ADSP-21020 supports an extended-precision
version of the same format with eight additional
bits in the mantissa (40 bits total). Both the ADSP-
21020 and ADSP-21010 also support 32-bit fixed-
point formats (fractional and integer) which can
be signed (twos-complement) or unsigned."

"The IEEE single-precision floating-point data
types supported by the ADSP-21020 and ADSP-
21010 and their interpretations are summarized in
Table D.1."

Type      Exponent       Fraction    Value

Normal    1 ≤ e ≤ 254    Any         (-1)^s (1.f22-0) 2^(e-127)
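The interpretation of a Normal single-precision value can be checked with a short sketch. It assumes the standard IEEE 754 field layout (sign in bit 31, exponent e in bits 30-23, fraction f in bits 22-0); the function name decode_normal is hypothetical, and the 40-bit extended format is not covered.

```python
def decode_normal(bits):
    """Decode a 32-bit IEEE single-precision Normal number.

    bits: the 32-bit pattern as an unsigned integer.
    Returns (-1)^s * (1.f) * 2^(e-127) as a Python float.
    """
    s = (bits >> 31) & 0x1
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    if not 1 <= e <= 254:
        # e = 0 and e = 255 encode zero, infinity and NAN instead
        raise ValueError("not a Normal number")
    mantissa = 1.0 + f / float(1 << 23)   # hidden leading 1, then fraction
    return (-1) ** s * mantissa * 2.0 ** (e - 127)


one_and_a_half = decode_normal(0x3FC00000)
minus_two = decode_normal(0xC0000000)
```

Exponent values outside 1 to 254 are rejected here because they are reserved for the other data types in the table.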

Register values after reset are specified. 

"At reset, all system registers except IRPTL,
USTAT1, and USTAT2 are cleared"

"BTF is read-only."

In the MODE1 Register bit definitions, the 
following text is added: 

Bit Name Definition 

8-9 Reserved 

16 RND32 1=Round floating-point data to 32 bits;

0=Round to 40 bits 

(must be set to 1 for ADSP-21010) 
p. E-5


In the ASTAT Register bit definitions, the
following text is deleted:

Bit Name Definition

18 BTF Bit test flag for system registers (Read-only)

and the following footnote is added: 

* Bit 17 (DAG1 circular buffer 7 overflow) and Bit 18 (DAG2
circular buffer 15 overflow) indicate the occurrence of a
circular buffer overflow. Rather than remaining set until
explicitly cleared, however, these bits are cleared by the next
subsequent memory access that uses the corresponding I
register (I7, I15). Circular buffer interrupts, therefore, should
be used instead of these STKY register bits. See Section
4.3.2.3, "Circular Buffer Overflow Interrupts," in Chapter 4.
Index


A


Abbreviations A-2


ABS B-15, B-31, B-77 

Absolute value ....B-15, B-23, B-26, B-27, B-31, B-44 

AC flag 2-8, 3-8, A-8, B-6, B-7, B-10, B-11

Accumulator 2-12

.ACH file 8-5, 8-8


Acknowledge; see also DMACK

and PMACK 6-9, 9-12

Add with carry B-6

Addition B-4, B-6, B-12, B-24, B-26, B-75, B-77

Address decoding 8-7

AF flag 2-9

AI flag 2-9

AIS flag 2-7

Alternate registers 1-8, 2-28, 4-3 

ALU 1-4, A-8 


Input operands 2-5 

Opcodes B-2, B-3 

Operations 2-5, 2-10, 7-13, B-2, B-3 

Status 2-7 


ALU saturation 2-6

ALUSAT bit 2-6

AN flag 2-8, B-9, B-29

AND B-17

AOS flag 2-8

Architecture description file 8-5, 8-8, 8-16

Arithmetic shift B-57, B-58

Arithmetic status 3-7

AS flag 2-9

ASHIFT B-57, B-58

Assembler 1-10, 8-15, 8-29

Assembly library/librarian 1-10

ASTAT register 2-3, 3-5, 3-21, 3-27

Default value at reset 9-5

Summary E-5

Ureg address A-9

Asynchronous inputs 9-4

Asynchronous interrupts 3-28

AUS flag 2-8

Autowrap 8-29

AV flag 2-8, 3-8, A-8

Average B-8, B-28, B-77

AVS flag 2-8

AZ flag 2-8, B-9, B-29


B


B registers 4-1

Default value at reset 9-5

Ureg address A-9

Background MR register 2-12

Base register 4-1

BCLR B-60

BG; see also Bus grant 6-13

Binary logarithm B-36

Bit operation 7-20, A-4, A-42, E-2

Bit test A-8, B-63, E-2

Bit-reverse 4-9, A-42

Instruction 4-10, A-42

Mode 4-9, 6-8, 7-8

Bit6 2-20, A-2

BITREV instruction 4-10, 6-10, 7-20, A-42

Booting 9-25

Borrow B-7, B-11

Boundary register C-4

Boundary scan C-1


BR0 bit 4-9

BR; see also Bus request/bus grant 6-13, 9-3 

Branch 3-6, 3-9, A-4, A-24, A-26 


BSET B-61 

BTF flag 3-7, A-42, E-2 

BTGL B-62 

BTST B-63 

Buffer latches 9-20 

Built-in self-test (BIST) C-11

Bus exchange (PX registers) 6-15 

Bus grant 3-23 

Bus request/bus grant 1-8, 6-6, 6-13, 7-5, 9-2, 9-16

Timing 6-14


C 


C compiler 1-10

CACC 2-7, E-5

Cache, instruction 3-30

Efficiency 3-32

Enable/disable 3-32

Freeze 3-32

Cache, external 6-6, 9-13

CADIS bit 3-32

CAFRZ bit 3-32


Call 3-6, 3-9, A-4, A-24, A-26


Capacitive loads 9-27







Carry A-8, B-6, B-10 

CB7I 3-24

CB15I 3-24

Circular buffers 4-1, 4-6, 7-8 

Clear bit A-4, A-42, B-60 

Clear MR 2-14, B-52 

CLIP B-23, B-44 

CLKIN 9-1, 9-3 

CLR A-42 

Compare accumulation; 

see also CACC 2-9, B-9, B-29 

Compare B-9, B-29 

Complement sign B-14, B-30 

Computation unit 1-4, 2-2, 7-9, A-4 

Computation unit register A-3 

Compute field B-1

Compute operation A-1, B-1

Conditions 3-7

Codes 3-8, 7-11, A-4, A-8

Mnemonics 3-8, A-8

Conditional branch 3-9

Conditional instruction A-1

Context switch 1-8, 2-28, 4-3 

Conventions, notation A-2 

Conversion 

Fixed-point-to-floating-point B-38, B-77 

Floating-point-to-fixed-point B-37, B-77 

COPYSIGN B-41 

Counter A-4, A-5 

Counter-based loops 3-15, 7-3, 8-31 

CURLCNTR 3-5, 3-7, 3-18, 7-6, 7-8 

Default value at reset 9-5 

Ureg address A-9 


D 


DADDR 3-5

Default value at reset 9-5

Ureg address A-9

Data address generator 1-6, 4-1

DAG architecture 4-2

DAG register transfers 4-10

DAG registers 4-1, 7-4, 7-8, 8-12

DAG restrictions 4-11

DAG1 4-1, A-4

DAG2 4-1


Data memory access 6-4, 6-5, A-12, A-14,

A-16, A-20, A-34, A-35, A-36


Data memory address hold time 6-10 

Data memory interface . 6-3 

Data memory read cycle 6-4 

Data memory write cycle 6-5 

DB; see also Delayed branch 

3-10, A-5, A-24, A-26 

Decode cycle 3-2 

Decrement B-13

DEF21020.H 8-21, 8-22

Delayed branch 3-9, 3-11, 3-22, 7-5, 8-39, A-5 

Denormal numbers 2-3 


Deposit field B-64, B-65, B-66, B-67

Development software 1-9, 8-1

Device identification register C-11

Direct branch 3-9, A-24

Direct memory access (DMA) 9-16

Division B-39


DMA bus 1-7


DMA controller 9-16 

DMA31-0 6-3, 6-7, 9-1, 9-4 

DMACK 6-3, 6-7, 6-9, 9-2, 9-12, 9-13, 9-14 

DMADR 4-10,4-12 

Default value at reset 9-5 

Ureg address A-10 

DMBANK1 4-12, 6-8, 8-12 

Default value at reset 9-5 

Ureg address A-10 

DMBANK2 4-12, 6-8, 8-12

Default value at reset 9-5 

Ureg address A-10 

DMBANK3 4-12, 6-8, 8-12 

Default value at reset 9-5 

Ureg address A-10 

DMD bus 1-7, 6-15 


DMD39-0 6-3, 6-7, 9-1, 9-4 

DMPAGE 6-3, 6-7, 6-12, 9-2, 9-4

DMRD 6-3, 6-7, 9-1, 9-4, 9-10, 9-31

DM50 4-10,7-8,9-9 

DMS3-0 6-3, 6-7, 6-8, 8-7, 9-1, 9-4

DMTS 6-3, 6-6, 9-2, 9-13, 9-14

DMWAIT 4-12, 6-10, 8-12, 8-23 


Default value at reset 9-5 


Summary E-ll 

Ureg address A-10 

DMWR 6-3, 6-7, 9-1, 9-4, 9-9 

DO UNTIL instruction 3-6, 3-13, 3-19, 8-14, 8-31 

DRAM interface 9-14 


Dreg A-2

Dual add/subtract 2-26, B-75, B-80

Dual-port RAM 9-24

Dynamic RAM (DRAM) 6-12, 9-2


E 


Edge-sensitive interrupts 3-27 

Effect latency 3-5, 7-7, E-2 

Emulator 1-11 

.ENDSYS directive 8-7 

EQ 3-8, A-8 

EX (extended exponent) B-71 

Exceptions 2-3 

Execute cycle 3-2 

EXP 2-24, B-70, B-71 

Exponent extraction B-36, B-70, B-71 

Extended floating-point format 2-3, D-2 

.EXTERN directive 8-37 

External interrupts 3-20, 3-27 

Extra cycle conditions 7-2 

Extract field B-68, B-69 

EZ-ICE® emulator 1-11, 9-30 


F 


F3-0 A-3 

F7-4 A-3 

F11-8 A-3

F15-12 A-3

Fa A-2 

FADDR 3-5 

Default value at reset 9-5 


Ureg address A-9 

FDEP B-64, B-65, B-66, B-67 

Fetch cycle 3-2 

FEXT B-68, B-69 

Field deposit B-64, B-65, B-66, B-67 

Field extract B-68, B-69 

FIFOs 9-22 

Filter coefficients 8-35, 8-36 

FIX B-37 

Fixed-point format 2-4, D-3 

Fixed-point-to-floating-point B-38 

FIXI 3-24 

Flags (FLAG3-0) 3-27, 9-3, 9-7, 9-21, A-8 

Direction 9-7 


Timing 9-8 

Value 9-7 


FLAG0_IN 3-8, A-8

FLAG1_IN 3-8, A-8

FLAG2_IN 3-8, A-8

FLAG3_IN 3-8, A-8

FLG0 9-7

FLG0O 9-7

FLG1 9-7

FLG1O 9-7

FLG2 9-7

FLG2O 9-7

FLG3 9-7

FLG3O 9-7

FLOAT B-38 

Floating-point format 1-3, D-l, D-2 

Floating-point precision 2-3, D-2 

Floating-point-to-fixed-point B-37 

FLTII 3-24 

FLTOI 3-24 

FLTUI 3-24 


Fm A-2

Fn A-2


Foreground MR register 2-12 

FOREVER 3-8, A-8 

Fractional format B-46, D-3 

Fractional result 2-12, D-4 

Fs A-2 


Fx A-2

Fy A-2


G-H 

GE 3-8, A-8 

GENERIC.ACH 8-6

.GLOBAL directive 8-31, 8-37 


GT 3-8, A-8 

High-level programming language 1-2 

Host processor 9-32 


I 


I registers 4-1 

Default value at reset 9-5 

Ureg address A-9 

I/O devices 9-11 

Ia A-3

Ic A-3

IDLE instruction 3-29, 7-20, 8-1, 8-14, 8-29, A-46

Idle state 3-1, 3-23, 3-29

IEEE 754/854 standard 2-2, D-1

IEEE 1149.1 specification 9-25, C-1

IF A-1

IIR filter 8-35

IIRCOEFS.DAT 8-36

IIRIRQ.ACH 8-17

IIRIRQ.ASM 8-19, 8-20

IIRMEM.ASM 8-10

IMASK 3-5, 3-25, 8-27

Default value at reset 9-5 


Summary E-8 

Ureg address A-9 

IMASKP 3-5,3-21,3-26 

Default value at reset 9-5 


Ureg address A-9 

Immediate address 7-20, A-34 

Immediate data 7-20, A-1, A-36, A-37

Immediate modify 7-20, 4-6, A-16, A-35, A-42

Immediate shift operation A-2, A-20, B-54 

Increment B-12 

Index register 4-1, A-4, A-6 

Indirect addressing 4-1 

Indirect branch 3-9, A-26 

Inexact flags 2-2 

Infinity D-2 

Initial setups 8-8, 8-19 

Input operands 

ALU 2-5 


Multiplier 2-11 

Shifter 2-19 

Instruction cache 1-6, 3-6, 3-30 

Instruction groups 7-1, A-1, A-11

Instruction pipeline

1-6, 3-3, 3-5, 3-12, 3-13, 3-15, 3-30

Instruction register C-2

Instruction set 1-8, 7-1

Instruction type 7-1, A-1

Integer format B-46, D-3 

Integer result 2-12, D-4 

Interrupt-driven operation 8-16 

Interrupt-driven data transfers 9-37 
Interrupts

1-7, 3-1, 3-20, 6-14, 7-6, 8-11, 8-13, 8-26

External 3-20, 3-27 

Latency 3-21 

Masking 3-25 

Priority 3-20, 3-21, 3-26 

Sensitivity 3-27 

Service routine 3-21 

Vectors 3-21, 7-17, 8-11, A-5 

IRPTEN bit 3-25 

IRPTL 3-5, 3-21, 3-23, 3-24, 3-25, 7-7, 8-28 

Default value at reset 9-5 

Summary E-8 

Ureg address A-9 

IRQ0E 3-28

IRQ0I 3-24

IRQ1E 3-28

IRQ1I 3-24

IRQ2E 3-28

IRQ2I 3-24

IRQ3-0; see also Interrupts, external 9-3

IRQ3E 3-28

IRQ3I 3-24


J-K-L 


JTAG C-1

Jump 3-1, 3-6, A-4, A-5, A-24, A-26

L registers 4-1, 8-12 

Default value at reset 9-5 

Ureg address A-9 

LA; see also Loop abort 3-9, A-3, A-24, A-26 


LADDR 3-5, 3-18 

Default value at reset 9-5 


Ureg address A-9 

LCE 3-8, 3-19, 7-8, A-8 

LCNTR 3-5, 3-19, 8-14, 8-31 

Default value at reset 9-5 


Ureg address A-9 

LE 3-8, A-8 

Leading ones B-73 

Leading zeros B-72 

LEFTO B-73 

LEFTZ B-72 


Len6 2-20, A-2 

Length register 4-1, 8-12 

Level-sensitive interrupts 3-27 

Linker 1-10,8-15,8-29 

Load variations 9-27 

Loader program 9-26 

LOGB B-36 

Logical shift B-55, B-56 

Loops 3-1, 3-6, 3-13, 7-2, 8-14, 8-31 

Loop-back 3-14 

Nesting 3-14, 3-17 

Restrictions 3-14, 7-6 

Stacks A-3, A-5, A-6, A-44

Termination 3-14 


Loop abort 3-9, 3-14, 3-18, A-3

Loop address stack 3-17

Loop counter 3-18, A-8

Loop counter stack 3-18

Looped code 8-31

Low power A-46

LRU bit 3-30

LSEM 3-28

LSHIFT B-55, B-56

LSOV 3-28

LT 3-8, A-8


M 

M registers 4-1

Default value at reset 9-5

Ureg address A-9

MANT B-35

Mantissa extraction B-35

Map 1 A-9

Map 2 A-10

MAX B-22, B-43, B-77

Maximum B-22, B-43

Mb A-3

Md A-3

Memory

Access A-4, A-5

Banks 6-8, 8-7, 9-9, 9-10

Configurations 9-8

Initialization 8-13, 8-25

Interface capacitive load 9-27

Maps 8-9, 8-18

Paging 6-12

Segments 8-7

Memory-mapped I/O 9-11

Memory-resident data 8-2

MI flag 2-17

MIN B-21, B-42, B-77

Minimum B-21, B-42

MIS flag 2-16

Miscellaneous instructions A-1, A-39

MN flag 2-17

Mod1 B-46

Mod2 B-46

MODE1 3-5, 3-21, 3-27, 8-27

Default value at reset 9-5

Summary E-4

Ureg address A-9

MODE2 3-5, 8-27

Default value at reset 9-5

Summary E-3

Ureg address A-9

Modify operation 7-20, A-22, A-42

Modify register 4-1, A-5, A-6

Modify, immediate A-16

Modulo addressing 4-6

MOS flag 2-17

MR clear B-52





MR register 2-12, B-1

MR register transfer 2-13, A-1, B-52

MR rounding B-51

MR saturation B-50

MR0 register 2-12

MR0B A-3, B-52

MR0F A-3, B-52

MR1 register 2-12 

MR1B A-3, B-52 

MR1F A-3, B-52 

MR2 register 2-12 

MR2B A-3, B-52 

MR2F A-3, B-52 

MRB 2-12 

MRF 2-12 

MS flag 3-8, A-8 

MU flag 2-16 

Multifunction computations 

1-5, 1-8, 2-26, 7-16, 8-34, 8-40, A-7, B-1, B-74

Multiplication, fixed-point B-47 

Multiplication, floating-point B-53 

Multiplier 1-4, A-8 

Input operands 2-11

Opcodes B-45 

Operations 2-18, 7-14, B-45 

Status 2-16 

Multiplier result (MR) 2-12, A-1, B-52

Multiplier/ALU operation B-77

Multiply/accumulate 2-11, B-48, B-49

Multiport memory 9-18 

Multiprocessor configurations 9-18 

MUS flag 2-16 

MV flag 2-17,3-8, A-8 

MVS flag 2-17 

N 

NAN (Not-A-Number) 2-2, D-2 

NE 3-8, A-8 

Negate B-14, B-30

Nested loops 3-21, 3-26, 7-6 

NESTM bit 3-21,3-26 

Nondelayed branch 3-9, 3-10, 7-2, A-5 

NOP 7-20, A-45 

NOT AC 3-8, A-8 

NOT AV 3-8, A-8 

NOT B-20 

NOT FLAG0_IN 3-8, A-8

NOT FLAG1_IN 3-8, A-8

NOT FLAG2_IN 3-8, A-8

NOT FLAG3_IN 3-8, A-8

NOT LCE 3-8, A-8 

NOT MS 3-8, A-8 

NOT MV 3-8, A-8 

NOT SV 3-8, A-8 

NOT SZ 3-8, A-8 

NOT TF 3-8, A-8 

Notation conventions A-2 

Numerical C compiler 1-10 


O-P-Q


Opcodes A-1, A-5

ALU B-2, B-3 

Multiplier B-45 

Notation A-3 

Shifter B-54 

OR B-18, B-56, B-58, B-65, B-67

Overflow A-8 

Page boundary detection 6-12, 7-5 

Page size 6-12, 6-13 

Parallel memory accesses A-12

Parallel multiplier/ALU operation 2-26

PASS B-16, B-32

PC 3-3, 3-5, 9-5 

PC register address A-9 

PC stack 3-12, 3-13, 3-21, A-3 

PC stack pointer 3-13 

PC-relative address A-2, A-6, A-24, A-26 

PC-relative branch 3-9 


PCEM 3-28 

PCFL 3-28 

PCSTK (PC stack) 3-5, 3-12 

Default value at reset 9-5 

Ureg address A-9 

PCSTKP (PC stack pointer) 3-5, 3-13 

Default value at reset 9-5 


Ureg address A-9 

PMA bus 1-7 


PMA23-0 6-2, 6-7, 9-1, 9-4 

PMACK 6-2, 6-7, 6-9, 9-2 

PMADR 4-12

Default value at reset 9-5

Ureg address A-10

PMBANK1 4-12, 6-8

Default value at reset 9-5

Ureg address A-10

PMD bus 1-7, 6-15

PMD47-0 6-2, 6-7, 9-1, 9-4

PMPAGE 6-2, 6-7, 6-12, 9-2, 9-6

PMRD 6-2, 6-7, 9-1, 9-4, 9-31

PMS1-0 6-2, 6-7, 6-8, 9-1, 9-6

PMTS 6-2, 6-6, 9-2

PMWAIT 4-12, 6-10, 8-23

Default value at reset 9-5

Summary E-10

Ureg address A-10

PMWR 6-2, 6-7, 9-1, 9-4

Pointers 4-1

Pop loop stack 3-18

Pop stack 7-20, A-44

Post-modify 4-4

Powerup 9-4, 9-25

Pre-modify 4-4, A-26

Private instructions C-11

Probe connector 9-30

Program counter (PC) 3-3

Program flow 3-2, 7-19, A-1, A-23





Program memory access 6-4, 6-5, A-12, 

A-14, A-16, A-20, A-34, A-35, A-36 

Program memory boot 9-25 

Program memory data access 3-6, 3-23, 3-30, 7-2 

Program memory interface 6-2 

Program memory read cycle 6-4 

Program memory write cycle 6-5 

Program sequencer 1-6, 3-1 

Architecture 3-3 

Registers 3-5 

Programmable wait states 6-9 

Programming 7-2, 8-1, 8-37 

PROM splitter 1-10 

Pullup resistors 9-9, 9-13 

Push loop stack 3-18 

Push stack 7-20, A-44 

PX 6-15 

Default value at reset 9-5 

Ureg address A-10 

PX1 6-15 

Default value at reset 9-5

Ureg address A-10 

PX2 6-15 

Default value at reset 9-5 

Ureg address A-10 


R 


R3-0 A-2 

R7-4 A-2 

R11-8 A-2

R15-12 A-3 

Ra A-2 

Read latency 3-5, 7-7 

Reciprocal seed B-39 

Reciprocal square root B-40 

RECIPS B-39 


Register file 2-1, 2-27, 7-9, 9-5, A-5, A-6 

Register transfers 1-8, 4-10, 5-4, 6-15, A-18, B-52 

RESET 9-1, 9-3, 9-4 

Reset 6-14, 8-8, 9-4, 9-25 

Return 3-6, 3-9 

Return address 3-21 


Rm A-2

Rn A-2

RND B-33

RND32 bit 2-3, 2-7, 2-15

Rolling loops 8-32

ROT B-59

Rotate B-59

Round MR 2-14

Rounding B-33, B-46, B-51

Boundary 2-15

Modes 2-4, 2-6, 2-15

Rs A-2

RSQRTS B-40

RSTI 3-24

Rx A-2


RXA A-7, B-78, B-79 

RXM A-7, B-78, B-79 

Ry A-2 

RYA A-7, B-78, B-79 

RYM A-7, B-78, B-79 


S 


Saturate MR 2-14 

Saturation 2-6, 2-14, B-50 

SCALB B-34

Scaling B-34 

Scope of variables 8-37 

SE (sign extension) B-66, B-67, B-69 

Segments 8-7 

Serial data flow 9-20 

Serial scan path 1-3, C-4 

Set bit A-4, A-40, B-61

SFT0I 3-24

SFT1I 3-24 

SFT2I 3-24 

SFT3I 3-24 

SFT4I 3-24 

SFT5I 3-24 

SFT6I 3-24 

SFT7I 3-24 

Shf8 2-20 

Shifter 1-4, A-8 

Fields 2-20 

Input operands 2-19 

Opcodes B-54 

Operations 2-25, 7-15, B-54 

Shifter immediate operation A-2, A-20, B-54 

Shiftimm A-2, A-20

Short loops 3-14, 7-3 

Signed format B-46, D-3, D-4 

Simulator 1-10, 8-15, 8-29 

Single-function operation B-1

Software interrupts 3-25 

SOVFI 3-24 

Square root B-40 

SRCU bit 2-13 

SRD1H bit 4-4 

SRD1L bit 4-4 

SRD2H bit 4-4 

SRD2L bit 4-4 


Sreg; see also System register A-2, A-6, A-40 

SRRFH bit 2-28 

SRRFL bit 2-28 

SS flag 2-24 

SSEM 3-28 

SSOV 3-28 

Stack flags 3-28,7-9 

Stack operation A-44 

Stack overflow 3-18, 3-28 

Static RAM (SRAM) 9-9 


Status flags 2-3 

Status stack 3-21, 3-26, A-5, A-6, A-44 




STKY register 2-4, 3-5, 3-23, 3-25 

Default value at reset 9-5 

Summary E-6 

Ureg address A-9 

Subroutines 3-1 

Subtract with borrow B-7 

Subtraction B-5, B-7, B-25, B-27, B-75, B-77 

SV flag 2-24, 3-8, A-8 

Synchronization delay 9-3 

Syntax A-1

Syntax notation 7-10, A-2 

.SYSTEM directive 8-7 

System registers 

3-5, 7-12, A-2, A-6, A-9, A-40, E-l 

SZ flag 2-24, 3-8, A-8, B-63 

T 

TCK 9-3, 9-30, C-1, C-2

TCOUNT 5-1 

Default value at reset 9-5 

Ureg address A-10 

TDI 9-3, 9-30, C-2 

TDO 9-3,9-4,9-30, C-2 

Termination address (for loops) 3-17 

Termination code (for loops) 3-17, 7-11 

Test access port (TAP) C-2 

Test bit A-4, A-40 

TF 3-8, A-8 

TGL A-40 

Three-state enable 1-8, 6-6, 7-5, 7-9 

TIMEN bit 5-2,8-27 

Timer 1-7, 3-30, 5-1, 8-25 

Enable /disable 5-1 

Interrupt 3-25, 5-4 

TIMEXP 5-1, 9-3, 9-4 

TMS 9-3, 9-30, C-2 

TMZHI 3-24, 5-4, 8-27 

TMZLI 3-24, 5-4, 8-27 

Toggle bit A-4, A-40, B-62 

TPERIOD 5-1 

Default value at reset 9-5 

Ureg address A-10 

TRST 9-3, 9-30, C-2

TRUE 3-8, A-8 

TRUNC bit 2-4,2-6,2-15 

TST A-40 


U-V 

Universal register 

1-8, 7-12, A-2, A-6, A-9, A-18, A-37


Unsigned format B-46, D-3, D-4 

Ureg; see also Universal register A-2, A-6, 

User status registers E-2 

USTAT1 3-5, E-2 

Default value at reset 9-5 

Ureg address A-9 

USTAT2 3-5, E-2 

Default value at reset 9-5 

Ureg address A-9 

Valid bit 3-30 

.VAR directive 8-37 

Vector A-5 

W-X-Y-Z 

Wait state modes 6-9 

Wait states 6-7, 6-9, 6-13, 7-5, 7-9, 8-23, 9-13 

XOR bit A-4, A-40 

XOR B-19 

Zero D-2 

